Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node

From Michael Paquier
Subject Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node
Msg-id 20190927061414.GF8485@paquier.xyz
In reply to Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node  (Fujii Masao <masao.fujii@gmail.com>)
List pgsql-hackers
On Thu, Sep 26, 2019 at 01:13:56AM +0900, Fujii Masao wrote:
> On Tue, Sep 24, 2019 at 10:41 AM Michael Paquier <michael@paquier.xyz> wrote:
>> This also points out that there are other things to worry about than
>> interruptions, as for example DropRelFileNodeLocalBuffers() could lead
>> to an ERROR, and this happens before the physical truncation is done
>> but after the WAL record is replayed on the standby, so any failures
>> happening at the truncation phase before the work is done would be a
>> problem.  However we are talking about failures which should not
>> happen and these are elog() calls.  It would be tempting to add a
>> critical section here, but we could still have problems if we have a
>> failure after the WAL record has been flushed, which means that it
>> would be replayed on the standby, and the surrounding comments are
>> clear about that.
>
> Could you elaborate on what problem adding a critical section there
> would cause?

Wrapping the smgrtruncate() call within RelationTruncate() in a
critical section would make things worse from the user's perspective
on the primary, no?  If the physical truncation fails, WAL replay
would still fail on the standby, but instead of raising an ERROR in
the session of the user attempting the TRUNCATE, the failure would be
promoted to a PANIC and the whole primary would be taken down.
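
To make the trade-off concrete, here is a minimal toy model in C.  It
is a sketch only, not PostgreSQL code: every toy_* name is hypothetical
and the severity values are illustrative.  It assumes nothing beyond
the rule that an ERROR raised while a critical section is open gets
promoted to PANIC, which is what START_CRIT_SECTION() and
END_CRIT_SECTION() arrange through CritSectionCount in the backend.

```c
/* Toy severity levels; names echo PostgreSQL's, values are illustrative. */
typedef enum
{
    TOY_ERROR,                  /* aborts only the current transaction */
    TOY_PANIC                   /* takes the whole server down */
} ToyLevel;

/* Hypothetical stand-in for the backend's CritSectionCount. */
static int toy_crit_section_count = 0;

static void
toy_start_crit_section(void)
{
    toy_crit_section_count++;
}

static void
toy_end_crit_section(void)
{
    toy_crit_section_count--;
}

/* Promotion rule: an ERROR inside a critical section becomes a PANIC. */
static ToyLevel
toy_effective_level(ToyLevel requested)
{
    if (requested == TOY_ERROR && toy_crit_section_count > 0)
        return TOY_PANIC;
    return requested;
}

/*
 * Simulate a physical truncation that fails, and return the severity the
 * failure would be reported with.  A real PANIC never returns, so the
 * closing call below exists only to keep the toy counter balanced.
 */
static ToyLevel
toy_failing_truncate(int use_crit_section)
{
    ToyLevel    level;

    if (use_crit_section)
        toy_start_crit_section();

    level = toy_effective_level(TOY_ERROR);

    if (use_crit_section)
        toy_end_crit_section();

    return level;
}
```

In this model toy_failing_truncate(0) yields TOY_ERROR (only the
session running TRUNCATE fails), while toy_failing_truncate(1) yields
TOY_PANIC (the primary goes down).  In both cases the WAL record has
already been written, so the standby is no better off.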
--
Michael
