Re: Deadlock between backend and recovery may not be detected

Поиск
Список
Период
Сортировка
От Fujii Masao
Тема Re: Deadlock between backend and recovery may not be detected
Дата
Msg-id 0bba5bf2-cd08-1038-09bd-b95350e89f0d@oss.nttdata.com
обсуждение исходный текст
Ответ на Re: Deadlock between backend and recovery may not be detected  ("Drouvot, Bertrand" <bdrouvot@amazon.com>)
Ответы Re: Deadlock between backend and recovery may not be detected  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Список pgsql-hackers

On 2020/12/16 23:28, Drouvot, Bertrand wrote:
> Hi,
> 
> On 12/16/20 2:36 PM, Victor Yegorov wrote:
>>
>> *CAUTION*: This email originated from outside of the organization. Do not click links or open attachments unless you
canconfirm the sender and know the content is safe.
 
>>
>>
>> ср, 16 дек. 2020 г. в 13:49, Fujii Masao <masao.fujii@oss.nttdata.com <mailto:masao.fujii@oss.nttdata.com>>:
>>
>>     After doing this procedure, you can see the startup process and backend
>>     wait for the table lock each other, i.e., deadlock. But this deadlock remains
>>     even after deadlock_timeout passes.
>>
>>     This seems a bug to me.
>>
> +1
> 
>>
>>     > * Deadlocks involving the Startup process and an ordinary backend process
>>     > * will be detected by the deadlock detector within the ordinary backend.
>>
>>     The cause of this issue seems that ResolveRecoveryConflictWithLock() that
>>     the startup process calls when recovery conflict on lock happens doesn't
>>     take care of deadlock case at all. You can see this fact by reading the above
>>     source code comment for ResolveRecoveryConflictWithLock().
>>
>>     To fix this issue, I think that we should enable STANDBY_DEADLOCK_TIMEOUT
>>     timer in ResolveRecoveryConflictWithLock() so that the startup process can
>>     send PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal to the backend.
>>     Then if PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal arrives,
>>     the backend should check whether the deadlock actually happens or not.
>>     Attached is the POC patch implimenting this.
>>
> good catch!
> 
> I don't see any obvious reasons why the STANDBY_DEADLOCK_TIMEOUT shouldn't be set in
ResolveRecoveryConflictWithLock()too (it is already set in ResolveRecoveryConflictWithBufferPin()).
 
> 
> So + 1 to consider this as a bug and for the way the patch proposes to fix it.

Thanks Victor and Bertrand for agreeing!
Attached is the updated version of the patch.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Eisentraut
Дата:
Сообщение: Re: SELECT INTO deprecation
Следующее
От: Justin Pryzby
Дата:
Сообщение: Re: pg_upgrade test for binary compatibility of core data types