Re: Add Information during standby recovery conflicts

Поиск
Список
Период
Сортировка
От Fujii Masao
Тема Re: Add Information during standby recovery conflicts
Дата
Msg-id 627f702d-a04c-b72b-b94f-214f448e2935@oss.nttdata.com
обсуждение исходный текст
Ответ на Re: Add Information during standby recovery conflicts  (Masahiko Sawada <sawada.mshk@gmail.com>)
Ответы Re: Add Information during standby recovery conflicts  ("Drouvot, Bertrand" <bdrouvot@amazon.com>)
Список pgsql-hackers

On 2020/12/04 9:28, Masahiko Sawada wrote:
> On Fri, Dec 4, 2020 at 2:54 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>>
>>
>>
>> On 2020/12/01 17:29, Drouvot, Bertrand wrote:
>>> Hi,
>>>
>>> On 12/1/20 12:35 AM, Masahiko Sawada wrote:
>>>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you
canconfirm the sender and know the content is safe.
 
>>>>
>>>>
>>>>
>>>> On Tue, Dec 1, 2020 at 3:25 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>>>>> On 2020-Dec-01, Fujii Masao wrote:
>>>>>
>>>>>> +                     if (proc)
>>>>>> +                     {
>>>>>> +                             if (nprocs == 0)
>>>>>> +                                     appendStringInfo(&buf, "%d", proc->pid);
>>>>>> +                             else
>>>>>> +                                     appendStringInfo(&buf, ", %d", proc->pid);
>>>>>> +
>>>>>> +                             nprocs++;
>>>>>>
>>>>>> What happens if all the backends in wait_list have gone? In other words,
>>>>>> how should we handle the case where nprocs == 0 (i.e., nprocs has not been
>>>>>> incrmented at all)? This would very rarely happen, but can happen.
>>>>>> In this case, since buf.data is empty, at least there seems no need to log
>>>>>> the list of conflicting processes in detail message.
>>>>> Yes, I noticed this too; this can be simplified by changing the
>>>>> condition in the ereport() call to be "nprocs > 0" (rather than
>>>>> wait_list being null), otherwise not print the errdetail.  (You could
>>>>> test buf.data or buf.len instead, but that seems uglier to me.)
>>>> +1
>>>>
>>>> Maybe we can also improve the comment of this function from:
>>>>
>>>> + * This function also reports the details about the conflicting
>>>> + * process ids if *wait_list is not NULL.
>>>>
>>>> to " This function also reports the details about the conflicting
>>>> process ids if exist" or something.
>>>>
>>> Thank you all for the review/remarks.
>>>
>>> They have been addressed in the new attached patch version.
>>
>> Thanks for updating the patch! I read through the patch again
>> and applied the following chages to it. Attached is the updated
>> version of the patch. Could you review this version? If there is
>> no issue in it, I'm thinking to commit this version.
> 
> Thank you for updating the patch! I have one question.
> 
>>
>> +                       timeouts[cnt].id = STANDBY_TIMEOUT;
>> +                       timeouts[cnt].type = TMPARAM_AFTER;
>> +                       timeouts[cnt].delay_ms = DeadlockTimeout;
>>
>> Maybe STANDBY_TIMEOUT should be STANDBY_DEADLOCK_TIMEOUT here?
>> I changed the code that way.
> 
> As the comment of ResolveRecoveryConflictWithLock() says the
> following, a deadlock is detected by the ordinary backend process:
> 
>   * Deadlocks involving the Startup process and an ordinary backend proces
>   * will be detected by the deadlock detector within the ordinary backend.
> 
> If we use STANDBY_DEADLOCK_TIMEOUT,
> SendRecoveryConflictWithBufferPin() will be called after
> DeadlockTimeout passed, but I think it's not necessary for the startup
> process in this case.

Thanks for pointing this! You are right.


> If we want to just wake up the startup process
> maybe we can use STANDBY_TIMEOUT here?

When STANDBY_TIMEOUT happens, a request to release conflicting buffer pins is sent. Right? If so, we should not also
useSTANDBY_TIMEOUT there?
 

Or, first of all, we don't need to enable the deadlock timer at all? Since what we'd like to do is to wake up after
deadlock_timeoutpasses, we can do that by changing ProcWaitForSignal() so that it can accept the timeout and giving the
deadlock_timeoutto it. If we do this, maybe we can get rid of STANDBY_LOCK_TIMEOUT from
ResolveRecoveryConflictWithLock().Thought?
 

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Jacob Champion
Дата:
Сообщение: Re: Support for NSS as a libpq TLS backend
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: scram-sha-256 broken with FIPS and OpenSSL 1.0.2