Recovery bug in GIN, missing full page image

From Heikki Linnakangas
Subject Recovery bug in GIN, missing full page image
Date
Msg-id 529DED09.3080501@vmware.com
Responses Re: Recovery bug in GIN, missing full page image
List pgsql-hackers
While looking at Alexander's GIN patch, I noticed an ancient bug in the
WAL-logging of GIN entry-tree insertions. The entryPlaceToPage and
dataPlaceToPage functions don't make a full-page image of the page when
inserting a downlink on a non-leaf page. The comment says:

>     /*
>      * Prevent full page write if child's split occurs. That is needed to
>      * remove incomplete splits while replaying WAL
>      *
>      * data.updateBlkno contains new block number (of newly created right
>      * page) for recently splited page.
>      */

The code is doing what the comment says, but that's wrong. You can't
just skip the full page write; it's needed for torn page protection,
just like in any other case. The correct fix would've been to change the
redo-routine to do the incomplete split tracking for the page even if
it's restored from a full page image.
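
To illustrate why the full page image can't be skipped, here is a
minimal toy model of a torn write. It is not PostgreSQL code: the
8-byte Page, the WalRecord struct and every function name below are
hypothetical stand-ins, and the real redo path also checks page LSNs,
which this sketch leaves out. The page keeps an item count in its first
half and the items in its second half, so a write that reaches disk
only halfway leaves the two halves disagreeing.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PAGE_SIZE 8
#define HALF (PAGE_SIZE / 2)

/* Toy page: bytes[0] is the item count, items fill in from the end. */
typedef struct { unsigned char bytes[PAGE_SIZE]; } Page;

/* Toy WAL record: an optional full-page image plus the logical change. */
typedef struct {
    bool          has_fpi;  /* record carries a full-page image */
    Page          fpi;      /* page after the change; used only if has_fpi */
    unsigned char item;     /* logical change: the item that was added */
} WalRecord;

static void page_add_item(Page *p, unsigned char item)
{
    unsigned char n = p->bytes[0];

    assert(n < HALF);                    /* items live in the second half */
    p->bytes[PAGE_SIZE - 1 - n] = item;
    p->bytes[0] = (unsigned char) (n + 1);
}

/* Apply the change in memory and build its WAL record. */
static WalRecord log_add_item(Page *mem, unsigned char item, bool with_fpi)
{
    WalRecord rec;

    page_add_item(mem, item);
    rec.has_fpi = with_fpi;
    rec.fpi = *mem;
    rec.item = item;
    return rec;
}

/* Crash in mid-flush: only the first half of the page reaches disk. */
static void torn_write(Page *disk, const Page *mem)
{
    memcpy(disk->bytes, mem->bytes, HALF);
}

/* Replay: restore the image if there is one, else redo the logical change. */
static void redo(Page *disk, const WalRecord *rec)
{
    if (rec->has_fpi)
        *disk = rec->fpi;
    else
        page_add_item(disk, rec->item);
}

/* True if replay brings the on-disk page back to the in-memory state. */
static bool recovers_after_torn_write(bool with_fpi)
{
    Page      disk;
    Page      mem;
    WalRecord rec;

    memset(&disk, 0, sizeof(disk));      /* state as of the last checkpoint */
    mem = disk;
    rec = log_add_item(&mem, 42, with_fpi);
    torn_write(&disk, &mem);
    redo(&disk, &rec);
    return memcmp(&disk, &mem, sizeof(Page)) == 0;
}
```

In this model, recovers_after_torn_write(true) succeeds because replay
overwrites the torn page wholesale, while recovers_after_torn_write(false)
fails: the logical change is re-applied on top of a header that already
counts the item, so the page ends up with a bogus extra slot.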

This was broken by this commit back in 2007:

> commit 853d1c3103fa961ae6219f0281885b345593d101
> Author: Teodor Sigaev <teodor@sigaev.ru>
> Date:   Mon Jun 4 15:56:28 2007 +0000
>
>     Fix bundle bugs of GIN:
>     - Fix possible deadlock between UPDATE and VACUUM queries. Bug never was
>       observed in 8.2, but it still exist there. HEAD is more sensitive to
>       bug after recent "ring" of buffer improvements.
>     - Fix WAL creation: if parent page is stored as is after split then
>       incomplete split isn't removed during replay. This happens rather rare, only
>       on large tables with a lot of updates/inserts.
>     - Fix WAL replay: there was wrong test of XLR_BKP_BLOCK_* for left
>       page after deletion of page. That causes wrong rightlink field: it pointed
>       to deleted page.
>     - add checking of match of clearing incomplete split
>     - cleanup incomplete split list after proceeding
>
>     All of this chages doesn't change on-disk storage, so backpatch...
>     But second point may be an issue for replaying logs from previous version.

The relevant part is the "Fix WAL creation" item. I searched the
archives but couldn't find any discussion leading to this fix.

In 2010, Tom actually fixed the redo-routine in commit
4016bdef8aded77b4903c457050622a5a1815c16, along with other fixes. So all
we need to do now is to fix that bogus logic in entryPlaceToPage to not
skip the full-page-image.
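
As a rough sketch of the redo-side behaviour this relies on, the toy
model below clears the incomplete-split entry before it looks at
whether the page was restored from a full page image. The names
(SplitTracker, ParentInsertRecord, redo_parent_insert and so on) are
hypothetical and chosen for illustration only; they are not the actual
structures or functions in the GIN redo code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SPLITS 8

/* One split whose downlink has not yet been seen during replay. */
typedef struct {
    bool     in_use;
    unsigned left_blkno;
    unsigned right_blkno;
} SplitEntry;

typedef struct {
    SplitEntry entries[MAX_SPLITS];
} SplitTracker;

/* Toy WAL record for an insertion into a parent (non-leaf) page. */
typedef struct {
    bool     page_restored_from_fpi;  /* page content came from an image */
    bool     inserts_downlink;        /* insertion completes a child split */
    unsigned new_right_blkno;         /* right page created by that split */
} ParentInsertRecord;

static void remember_split(SplitTracker *t, unsigned left, unsigned right)
{
    for (size_t i = 0; i < MAX_SPLITS; i++)
    {
        if (!t->entries[i].in_use)
        {
            t->entries[i].in_use = true;
            t->entries[i].left_blkno = left;
            t->entries[i].right_blkno = right;
            return;
        }
    }
    assert(false && "split tracker full");
}

static void forget_split(SplitTracker *t, unsigned right)
{
    for (size_t i = 0; i < MAX_SPLITS; i++)
    {
        if (t->entries[i].in_use && t->entries[i].right_blkno == right)
            t->entries[i].in_use = false;
    }
}

static size_t pending_splits(const SplitTracker *t)
{
    size_t n = 0;

    for (size_t i = 0; i < MAX_SPLITS; i++)
    {
        if (t->entries[i].in_use)
            n++;
    }
    return n;
}

/*
 * Replay a parent-page insertion. The split tracking runs first and
 * unconditionally; only afterwards does the full page image decide
 * whether the page change itself must be re-applied.
 * Returns true if the caller still has to re-apply the insertion.
 */
static bool redo_parent_insert(SplitTracker *t, const ParentInsertRecord *rec)
{
    if (rec->inserts_downlink)
        forget_split(t, rec->new_right_blkno);

    return !rec->page_restored_from_fpi;
}
```

With the tracking done up front like this, a record that carries a full
page image still clears its pending split, so the insertion side has no
reason to suppress the image.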

Attached is a patch for 9.3. I've whacked the code a lot in master, but
the same bug is present there too.

- Heikki
