Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING

From: Andrey M. Borodin
Subject: Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING
Date: 20 July 2020, 21:37:24
Msg-id: 7DA5FBEF-6CD8-42F7-98DC-CC320EFE61DF@yandex-team.ru
In reply to: Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses: Re: Allow ERROR from heap_prepare_freeze_tuple to be downgraded to WARNING (Alvaro Herrera <alvherre@2ndquadrant.com>)
List: pgsql-hackers


> On 20 July 2020, at 21:44, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
>> I think we shall do that in some cases
>> but IMHO it's not a very good idea in all the cases.  Basically, if
>> the xmin precedes the relfrozenxid then probably we should allow to
>> update the relfrozenxid whereas if the xmin precedes cutoff xid and
>> still uncommitted then probably we might stop relfrozenxid from being
>> updated so that we can stop CLOG from getting truncated.
>
> I'm not sure I understand 100% what you're talking about here (the first
> half seems dangerous unless you misspoke), but in any case it seems a
> pointless optimization.  I mean, if the heap is corrupted, you can hope
> to complete the vacuum (which will hopefully return which *other* tuples
> are similarly corrupt) but trying to advance relfrozenxid is a lost
> cause.

I think the point here is to actually move relfrozenxid back. But you can't un-grind the mince: once CLOG has been
truncated, the table is corrupted beyond easy repair.

I'm not sure it's Dilip's case, but I'll try to describe what I was encountering.

We were observing this kind of corruption in three cases:
1. With a bug in the page cache of a patched Linux kernel, we could lose an FS page write.
2. With a bug in WAL-G block-level incremental backup, we could lose an update of the page.
3. With a firmware bug in SSD drives from one vendor, one write to the block storage device was lost.
In each case, one page in the database ends up at some non-latest version (with a correct checksum; it's just an old
version). In our case it was usually a VACUUM of the page that was lost (with the freezing of all its tuples). Some
tuples are not marked as frozen, while the VM has the frozen bit set for the page. Everything works just fine until
someone updates a tuple on the same page: the VM bit is reset and eventually a user will try to consult CLOG, which is
already truncated.
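
To make the sequence concrete, here is a toy Python model of the scenario described above. It is only a sketch: none of the names (Tup, Table, vacuum_freeze, update_on_page, scan) are PostgreSQL APIs, there is a single heap page, and the VM bit is treated as the only thing that hides the unfrozen tuples, which is a simplification.

```python
# Toy model of a lost heap-page write during VACUUM FREEZE.
# All names are hypothetical; this is not PostgreSQL code.
from dataclasses import dataclass


@dataclass
class Tup:
    xmin: int
    frozen: bool = False


@dataclass
class Table:
    page: list                  # tuples on the single heap page
    vm_all_frozen: bool = False
    clog_oldest_xid: int = 0    # XIDs below this have no CLOG status any more


def vacuum_freeze(table, cutoff_xid, page_write_lost):
    """Freeze old tuples, set the VM bit, then truncate CLOG."""
    if not page_write_lost:
        for t in table.page:
            if t.xmin < cutoff_xid:
                t.frozen = True
    # The VM update and the CLOG truncation persist even when the
    # heap page write itself was lost.
    table.vm_all_frozen = True
    table.clog_oldest_xid = cutoff_xid


def update_on_page(table, new_xid):
    """Any modification of the page clears its VM bit."""
    table.page.append(Tup(xmin=new_xid))
    table.vm_all_frozen = False


def is_visible(table, tup):
    if tup.frozen:
        return True
    if tup.xmin < table.clog_oldest_xid:
        raise LookupError(f"CLOG for xid {tup.xmin} is already truncated")
    return True  # toy assumption: every remaining xid committed


def scan(table):
    """Count visible tuples; a set VM bit means tuples are not inspected."""
    if table.vm_all_frozen:
        return len(table.page)
    return sum(1 for t in table.page if is_visible(table, t))


if __name__ == "__main__":
    t = Table(page=[Tup(xmin=100), Tup(xmin=101)])
    vacuum_freeze(t, cutoff_xid=200, page_write_lost=True)
    print("scan while VM bit is set:", scan(t))
    update_on_page(t, new_xid=300)
    try:
        scan(t)
    except LookupError as exc:
        print("scan after update failed:", exc)
```

In this model the damage stays invisible for as long as the VM bit is set, and only the later, unrelated update exposes it.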

This is why we may need to defer CLOG truncation or even move relfrozenxid back.

FWIW we coped with this by actively monitoring for this kind of corruption with this amcheck patch [0]. One can detect
these lost page updates cheaply in indexes and act at the first sign of corruption: identify the source of the buggy
behaviour.

Dilip, does this sound like a corruption case you are working on?

Thanks!

Best regards, Andrey Borodin.

[0] https://commitfest.postgresql.org/24/2254/
