Re: Deadlock between backend and recovery may not be detected

Поиск
Список
Период
Сортировка
От Fujii Masao
Тема Re: Deadlock between backend and recovery may not be detected
Дата
Msg-id c414ad2e-ef75-1b60-3611-e8fe367e136d@oss.nttdata.com
обсуждение исходный текст
Ответ на Re: Deadlock between backend and recovery may not be detected  (Masahiko Sawada <sawada.mshk@gmail.com>)
Ответы Re: Deadlock between backend and recovery may not be detected  (Masahiko Sawada <sawada.mshk@gmail.com>)
Re: Deadlock between backend and recovery may not be detected  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Список pgsql-hackers

On 2020/12/23 19:28, Masahiko Sawada wrote:
> On Tue, Dec 22, 2020 at 11:58 PM Fujii Masao
> <masao.fujii@oss.nttdata.com> wrote:
>>
>>
>>
>> On 2020/12/22 20:42, Fujii Masao wrote:
>>>
>>>
>>> On 2020/12/22 10:25, Masahiko Sawada wrote:
>>>> On Fri, Dec 18, 2020 at 6:36 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 2020/12/17 2:15, Fujii Masao wrote:
>>>>>>
>>>>>>
>>>>>> On 2020/12/16 23:28, Drouvot, Bertrand wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 12/16/20 2:36 PM, Victor Yegorov wrote:
>>>>>>>>
>>>>>>>> *CAUTION*: This email originated from outside of the organization. Do not click links or open attachments
unlessyou can confirm the sender and know the content is safe.
 
>>>>>>>>
>>>>>>>>
>>>>>>>> ср, 16 дек. 2020 г. в 13:49, Fujii Masao <masao.fujii@oss.nttdata.com <mailto:masao.fujii@oss.nttdata.com>>:
>>>>>>>>
>>>>>>>>       After doing this procedure, you can see the startup process and backend
>>>>>>>>       wait for the table lock each other, i.e., deadlock. But this deadlock remains
>>>>>>>>       even after deadlock_timeout passes.
>>>>>>>>
>>>>>>>>       This seems a bug to me.
>>>>>>>>
>>>>>>> +1
>>>>>>>
>>>>>>>>
>>>>>>>>       > * Deadlocks involving the Startup process and an ordinary backend process
>>>>>>>>       > * will be detected by the deadlock detector within the ordinary backend.
>>>>>>>>
>>>>>>>>       The cause of this issue seems that ResolveRecoveryConflictWithLock() that
>>>>>>>>       the startup process calls when recovery conflict on lock happens doesn't
>>>>>>>>       take care of deadlock case at all. You can see this fact by reading the above
>>>>>>>>       source code comment for ResolveRecoveryConflictWithLock().
>>>>>>>>
>>>>>>>>       To fix this issue, I think that we should enable STANDBY_DEADLOCK_TIMEOUT
>>>>>>>>       timer in ResolveRecoveryConflictWithLock() so that the startup process can
>>>>>>>>       send PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal to the backend.
>>>>>>>>       Then if PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK signal arrives,
>>>>>>>>       the backend should check whether the deadlock actually happens or not.
>>>>>>>>       Attached is the POC patch implimenting this.
>>>>>>>>
>>>>>>> good catch!
>>>>>>>
>>>>>>> I don't see any obvious reasons why the STANDBY_DEADLOCK_TIMEOUT shouldn't be set in
ResolveRecoveryConflictWithLock()too (it is already set in ResolveRecoveryConflictWithBufferPin()).
 
>>>>>>>
>>>>>>> So + 1 to consider this as a bug and for the way the patch proposes to fix it.
>>>>>>
>>>>>> Thanks Victor and Bertrand for agreeing!
>>>>>> Attached is the updated version of the patch.
>>>>>
>>>>> Attached is v3 of the patch. Could you review this version?
>>>>>
>>>>> While the startup process is waiting for recovery conflict on buffer pin,
>>>>> it repeats sending the request for deadlock check to all the backends
>>>>> every deadlock_timeout. This may increase the workload in the startup
>>>>> process and backends, but since this is the original behavior, the patch
>>>>> doesn't change that. Also in practice this may not be so harmful because
>>>>> the period that the buffer is kept pinned is basically not so long.
>>>>>
>>>>
>>>> @@ -529,6 +603,26 @@ ResolveRecoveryConflictWithBufferPin(void)
>>>>        */
>>>>       ProcWaitForSignal(PG_WAIT_BUFFER_PIN);
>>>>
>>>> +   if (got_standby_deadlock_timeout)
>>>> +   {
>>>> +       /*
>>>> +        * Send out a request for hot-standby backends to check themselves for
>>>> +        * deadlocks.
>>>> +        *
>>>> +        * XXX The subsequent ResolveRecoveryConflictWithBufferPin() will wait
>>>> +        * to be signaled by UnpinBuffer() again and send a request for
>>>> +        * deadlocks check if deadlock_timeout happens. This causes the
>>>> +        * request to continue to be sent every deadlock_timeout until the
>>>> +        * buffer is unpinned or ltime is reached. This would increase the
>>>> +        * workload in the startup process and backends. In practice it may
>>>> +        * not be so harmful because the period that the buffer is kept pinned
>>>> +        * is basically no so long. But we should fix this?
>>>> +        */
>>>> +       SendRecoveryConflictWithBufferPin(
>>>> +
>>>> PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK);
>>>> +       got_standby_deadlock_timeout = false;
>>>> +   }
>>>> +
>>>>
>>>> Since SendRecoveryConflictWithBufferPin() sends the signal to all
>>>> backends every backend who is waiting on a lock at ProcSleep() and not
>>>> holding a buffer pin blocking the startup process will end up doing a
>>>> deadlock check, which seems expensive. What is worse is that the
>>>> deadlock will not be detected because the deadlock involving a buffer
>>>> pin isn't detected by CheckDeadLock(). I thought we can replace
>>>> PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK with
>>>> PROCSIG_RECOVERY_CONFLICT_BUFFERPIN but it’s not good because the
>>>> backend who has a buffer pin blocking the startup process and not
>>>> waiting on a lock is also canceled after deadlock_timeout. We can have
>>>> the backend return from RecoveryConflictInterrupt() when it received
>>>> PROCSIG_RECOVERY_CONFLICT_BUFFERPIN and is not waiting on any lock,
>>>> but it’s also not good because we cannot cancel the backend after
>>>> max_standby_streaming_delay that has a buffer pin blocking the startup
>>>> process. So I wonder if we can have a new signal. When the backend
>>>> received this signal it returns from RecoveryConflictInterrupt()
>>>> without deadlock check either if it’s not waiting on any lock or if it
>>>> doesn’t have a buffer pin blocking the startup process. Otherwise it's
>>>> cancelled.
>>>
>>> Thanks for pointing out that issue! Using new signal is an idea. Another idea
>>> is to make a backend skip check the deadlock if GetStartupBufferPinWaitBufId()
>>> returns -1, i.e., the startup process is not waiting for buffer pin. So,
>>> what I'm thinkins is;
>>>
>>> In RecoveryConflictInterrupt(), when a backend receive
>>> PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK,
>>>
>>> 1. If a backend isn't waiting for a lock, it does nothing .
>>> 2. If a backend is waiting for a lock and also holding a buffer pin that
>>>      delays recovery, it may be canceled.
>>> 3. If a backend is waiting for a lock and the startup process is not waiting
>>>      for buffer pin (i.e., the startup process is also waiting for a lock),
>>>      it checks for the deadlocks.
>>> 4. If a backend is waiting for a lock and isn't holding a buffer pin that
>>>      delays recovery though the startup process is waiting for buffer pin,
>>>      it does nothing.
>>
> 
> Good idea! It could still happen that if the startup process sets
> startupBufferPinWaitBufId to -1 after sending the signal and before
> the backend checks it, the backend will end up doing an unmeaningful
> deadlock check. But the likelihood would be low in practice.
> 
> I have two small comments on ResolveRecoveryConflictWithBufferPin() in
> the v4 patch:
> 
> The code currently has three branches as follow:
> 
>      if (ltime == 0)
>      {
>          enable a timeout for deadlock;
>      }
>      else if (GetCurrentTimestamp() >= ltime)
>      {
>          send recovery conflict signal;
>      }
>      else
>      {
>          enable two timeouts: ltime and deadlock
>      }
> 
> I think we can rearrange the code similar to the changes you made on
> ResolveRecoveryConflictWithLock():
> 
>      if (GetCurrentTimestamp() >= ltime && ltime != 0)
>      {
>          Resolve recovery conflict;
>      }
>      else
>      {
>          Enable one or two timeouts: ltime and deadlock
>      }
> 
> It's more consistent with ResolveRecoveryConflictWithLock(). And
> currently the patch doesn't reset got_standby_deadlock_timeout in
> (ltime == 0) case but it will also be resolved by this rearrangement.

I didn't want to change the code structure as much as possible because
the patch needs to be back-patched. But it's good idea to make the code
structures in ResolveRecoveryConflictWithLock() and ...WithBufferPin() similar,
to make the code simpler and easier-to-read. So I agree with you. Attached
is the updated of the patch. What about this version?

> 
> ---
> If we always reset got_standby_deadlock_timeout before waiting it's
> not necessary but we might want to clear got_standby_deadlock_timeout
> also after disabling all timeouts to ensure that it's cleared at the
> end of the function. In ResolveRecoveryConflictWithLock() we clear
> both got_standby_lock_timeout and got_standby_deadlock_timeout after
> disabling all timeouts but we don't do that in
> ResolveRecoveryConflictWithBufferPin().

I agree that it's better to reset got_standby_deadlock_timeout after
all the timeouts are disabled. So I changed the patch that way. OTOH
got_standby_lock_timeout doesn't need to be reset because it's never
enabled in ResolveRecoveryConflictWithBufferPin(). No?

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: Fail Fast In CTAS/CMV If Relation Already Exists To Avoid Unnecessary Rewrite, Planning Costs
Следующее
От: "k.jamison@fujitsu.com"
Дата:
Сообщение: RE: [Patch] Optimize dropping of relation buffers using dlist