Re: [bug fix] Cascaded standby cannot start after a clean shutdown

From	Michael Paquier
Subject	Re: [bug fix] Cascaded standby cannot start after a clean shutdown
Date	February 26, 2018, 15:19:21
Msg-id	20180226091921.GG6927@paquier.xyz
In reply to	Re: [bug fix] Cascaded standby cannot start after a clean shutdown (Michael Paquier <michael@paquier.xyz>)
Responses	RE: [bug fix] Cascaded standby cannot start after a clean shutdown ("Tsunakawa, Takayuki" <tsunakawa.takay@jp.fujitsu.com>)
List	pgsql-hackers

On Mon, Feb 26, 2018 at 05:08:49PM +0900, Michael Paquier wrote:
> This was mentioned back in 2001 by the way, but this did not count much
> for the case discussed here:
> https://www.postgresql.org/message-id/24901.995381770%40sss.pgh.pa.us
> The issue here is that the streaming case makes it easier to hit the
> problem as it opens more easily access to not-completely written WAL
> pages depending on the message frequency during replication.  At the
> same time, we are discussing about a very low-probability issue.  Note
> that if the XLOG reader is bumping into this problem, then at the next
> WAL receiver wake up, recovery would begin from the beginning of the
> last segment, and if the primary has produced some more WAL then the
> standby would be able to actually avoid the random junk.  It is also
> possible to bypass the problem by zeroing manually the areas in
> question, or to actually wait for the standby to generate more WAL so as
> the garbage is overwritten automatically.  And you really need to be
> very, very unlucky to have random garbage able to bypass the header
> validation checks.

By the way, while I have my mind on it: another strategy would be
to just make the checks in XLogReadRecord() a bit smarter if the whole
record header is not on the page.  If we check at least for
AllocSizeIsValid(total_len), then this code would not fail on an
allocation as your user reported.  Still, this misses the case where a
record size is lower than 1GB but invalid, so allocate_recordbuf()
would still allocate a buffer for nothing :(

At least this extra check is costless, and avoids all kinds of hard
failures.
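
To make the idea concrete, here is a minimal standalone sketch of such a
check.  It is not a patch against xlogreader.c: the two macros are local
stand-ins that mirror the definitions in PostgreSQL's memutils.h, and
total_len_is_allocatable() is a hypothetical helper name used only for
illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins mirroring the definitions in PostgreSQL's memutils.h. */
#define MaxAllocSize ((size_t) 0x3fffffff)	/* 1 gigabyte - 1 */
#define AllocSizeIsValid(size) ((size_t) (size) <= MaxAllocSize)

/*
 * Hypothetical helper (an assumption for this sketch, not existing code):
 * returns true if a total length read from a record whose header is not
 * fully on the page is small enough to be handed to an allocation.
 */
static bool
total_len_is_allocatable(uint32_t total_len)
{
	return AllocSizeIsValid(total_len);
}
```

A garbage length such as 0xFFFFFFFF is rejected before any allocation is
attempted, while a garbage length just under 1GB still passes, which is
the remaining gap mentioned above.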
--
Michael

Attachments

signature.asc
