Re: VM corruption on standby

From	Aleksander Alekseev
Subject	Re: VM corruption on standby
Date	9 August, 23:54:42
Msg-id	CAJ7c6TMpt9Cr+M2_G97iKp_-TfLNm7ZOtHWyTVpdQKmocxchHw@mail.gmail.com
In reply to	Re: VM corruption on standby (Andrey Borodin <x4mmm@yandex-team.ru>)
Responses	Re: VM corruption on standby Re: VM corruption on standby
List	pgsql-hackers

Hi Andrey,

> 0. checkpointer is going to flush a heap buffer but waits on content lock
> 1. client is resetting PD_ALL_VISIBLE from page
> 2. postmaster is killed and command client to go down
> 3. client calls LWLockReleaseAll() at ProcKill() (?)
> 4. checkpointer flushes buffer with reset PD_ALL_VISIBLE that is not WAL-logged to standby
> 5. subsequent deletes do not log resetting this bit
> 6. deleted data is observable on standby with IndexOnlyScan

Thanks for investigating this in more detail. If this is indeed what
happens, it is a violation of the "log before changing" approach. This
is why we have PageHeaderData.pd_lsn, for instance: to make sure a
page is evicted only *after* the WAL record that changed it is written
to disk (because WAL records can't be applied to pages from the
future).
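
To illustrate the rule, here is a minimal toy model in C. It is not
PostgreSQL code: the Toy* types and toy_* functions are hypothetical
names made up for this sketch, and only the pd_lsn field name is
borrowed from PageHeaderData. It assumes one LSN counter per WAL record
and shows that a page may be written out only once WAL is durable up to
the page's LSN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy types; these are NOT PostgreSQL's real definitions. */
typedef uint64_t ToyLsn;

typedef struct ToyPage
{
    ToyLsn pd_lsn;   /* LSN of the last WAL record that modified the page */
    bool   dirty;    /* page differs from its on-disk copy */
} ToyPage;

typedef struct ToyWal
{
    ToyLsn insert_lsn;  /* end of WAL generated so far (in memory) */
    ToyLsn flush_lsn;   /* end of WAL known to be durable on disk */
} ToyWal;

/* Log a change: append a WAL record and stamp the page with its LSN. */
static ToyLsn
toy_log_change(ToyWal *wal, ToyPage *page)
{
    wal->insert_lsn += 1;
    page->pd_lsn = wal->insert_lsn;
    page->dirty = true;
    return page->pd_lsn;
}

/* Make WAL durable up to the given LSN. */
static void
toy_wal_flush(ToyWal *wal, ToyLsn upto)
{
    assert(upto <= wal->insert_lsn);  /* cannot flush WAL that does not exist */
    if (upto > wal->flush_lsn)
        wal->flush_lsn = upto;
}

/* The rule itself: the page's last record must already be on disk. */
static bool
toy_can_write_page(const ToyWal *wal, const ToyPage *page)
{
    return page->pd_lsn <= wal->flush_lsn;
}

/* Write the page out, flushing WAL first if the rule requires it. */
static void
toy_write_page(ToyWal *wal, ToyPage *page)
{
    if (!toy_can_write_page(wal, page))
        toy_wal_flush(wal, page->pd_lsn);
    page->dirty = false;
}
```

Note that in this model a page changed *without* going through
toy_log_change() keeps its old pd_lsn, so toy_can_write_page() still
returns true for it. The check only protects changes that were actually
logged, which is consistent with the scenario quoted above.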

I guess the intent here could be to do an optimization of some sort,
but two facts were not considered: 1. the instance can be killed at
any time and 2. there might be replicas.

> Any idea how to fix this?

IMHO: log the change first, and only then allow the page to be evicted.
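
A rough sketch of that ordering, again as a toy model rather than real
PostgreSQL code: ToyState and the toy_* functions are hypothetical, and
"standby_all_visible" simply stands for whatever state a standby would
reconstruct from the WAL it replays. The assumption is that a flush can
happen any time the content lock is not held.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy state; NOT a PostgreSQL structure. */
typedef struct ToyState
{
    bool page_all_visible;     /* bit on the primary's in-memory page */
    bool standby_all_visible;  /* bit the standby derives from replayed WAL */
    bool content_locked;       /* content lock held by the modifying backend */
} ToyState;

/* Suggested order: change and log under the lock, release only afterwards. */
static void
toy_clear_bit_logged(ToyState *s)
{
    s->content_locked = true;
    s->page_all_visible = false;     /* modify the page in shared buffers */
    s->standby_all_visible = false;  /* WAL record emitted, later replayed */
    s->content_locked = false;       /* only now can the page be flushed */
}

/* Problematic interleaving: the lock is dropped before anything is logged. */
static void
toy_clear_bit_interrupted(ToyState *s)
{
    s->content_locked = true;
    s->page_all_visible = false;     /* page modified ... */
    s->content_locked = false;       /* ... lock released, no WAL record yet */
}

/* Would flushing the page now leave primary and standby disagreeing? */
static bool
toy_flush_would_diverge(const ToyState *s)
{
    return !s->content_locked &&
           s->page_all_visible != s->standby_all_visible;
}
```

With this model, toy_flush_would_diverge() is false after
toy_clear_bit_logged() and true after toy_clear_bit_interrupted(),
which is the window steps 3 and 4 above describe.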
