Incorrect error handling for two-phase state files resulting in dataloss

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема Incorrect error handling for two-phase state files resulting in dataloss
Дата
Msg-id 20180709050309.GM1467@paquier.xyz
обсуждение исходный текст
Ответы Re: Incorrect error handling for two-phase state files resulting indata loss
Список pgsql-hackers
Hi all,

This is a follow-up of the following thread where I have touched the
topic of corrupted 2PC files being completely ignored by recovery:
https://www.postgresql.org/message-id/20180709012955.GD1467%40paquier.xyz
I have posted a patch on this thread, but after more reviews I have
noticed that it does not close all the holes.

What happens is that when a two-phase state file cannot be read by
recovery when the file is on disk, which is what ReadTwoPhaseFile does,
then its data is simply skipped.  It is perfectly possible that an
on-disk 2PC is missing at crash recovery if it has been already
processed, so if trying to open the file and seeing an ENOENT, then the
file can be properly skipped.  However, if you look at this API, it
skips *all* errors instead of the ones that matter.  For example, if a
file needs to be processed and is cannot be opened because of permission
issues, then it is simply skipped.  If its size, its CRC or its magic
number is incorrect, the same thing happens.  Skipping unconditionally
files this way can result in data loss.

I think that we really need to harden things, by making
ReadTwoPhaseFile() fail hard is it finds something unexpected, which is
in this case anything except trying to open a file which fails on
ENOENT, and that this stuff should be back-patched.

Thoughts?
--
Michael

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Imai, Yoshikazu"
Дата:
Сообщение: RE: Locking B-tree leafs immediately in exclusive mode
Следующее
От: Tatsuro Yamada
Дата:
Сообщение: Re: Postgres 11 release notes