Re: [PERFORM] Cpu usage 100% on slave. s_lock problem.

Поиск
Список
Период
Сортировка
От Merlin Moncure
Тема Re: [PERFORM] Cpu usage 100% on slave. s_lock problem.
Дата
Msg-id CAHyXU0zUQsBFePdi2Yj+T40j3O8AeG30RSQO42-6NpbCU9Xx1A@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PERFORM] Cpu usage 100% on slave. s_lock problem.  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Ответы Re: [PERFORM] Cpu usage 100% on slave. s_lock problem.  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Список pgsql-hackers
On Thu, Nov 21, 2013 at 10:37 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> On 21.11.2013 17:08, Merlin Moncure wrote:
>>
>> On Thu, Nov 21, 2013 at 9:02 AM, Andres Freund <andres@2ndquadrant.com>
>> wrote:
>>>
>>> On 2013-11-21 16:25:02 +0200, Heikki Linnakangas wrote:
>>>>
>>>> Hmm. All callers of RecoveryInProgress() must be prepared to handle the
>>>> case
>>>> that RecoveryInProgress() returns true, but the system is no longer in
>>>> recovery. No matter what locking we do in RecoveryInProgress(), the
>>>> startup
>>>> process might finish recovery just after RecoveryInProgress() has
>>>> returned.
>>>
>>>
>>> True.
>>>
>>>> What about the attached? It reads the shared variable without a lock or
>>>> barrier. If it returns 'true', but the system in fact just exited
>>>> recovery,
>>>> that's OK. As explained above, all the callers must tolerate that
>>>> anyway.
>>>> But if it returns 'false', then it performs a full memory barrier, which
>>>> should ensure that it sees any other shared variables as it is after the
>>>> startup process cleared SharedRecoveryInProgress (notably,
>>>> XLogCtl->ThisTimeLineID).
>>>
>>>
>>> I'd argue that we should also remove the spinlock in StartupXLOG and
>>> replace it with a write barrier. Obviously not for performance reasons,
>>> but because somebody might add more code to run under that spinlock.
>>>
>>> Looks good otherwise, although a read memory barrier ought to suffice.
>>
>>
>> This code is in a very hot code path.  Are we *sure* that the read
>> barrier is fast enough that we don't want to provide an alternate
>> function that only returns the local flag?  I don't know enough about
>> them to say either way.
>
>
> In my patch, I put the barrier inside the if (!LocalRecoveryInProgress)
> block. That codepath can only execute once in a backend, so performance is
> not an issue there. Does that look sane to you?

oh right -- certainly!

merlin



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: Re: new unicode table border styles for psql
Следующее
От: Andres Freund
Дата:
Сообщение: MultiXact truncation, startup et al.