Re: MultiXact\SLRU buffers configuration

From Kyotaro Horiguchi
Subject Re: MultiXact\SLRU buffers configuration
Msg-id 20200520.135404.64670166185539892.horikyota.ntt@gmail.com
In reply to Re: MultiXact\SLRU buffers configuration  ("Andrey M. Borodin" <x4mmm@yandex-team.ru>)
List pgsql-hackers
At Fri, 15 May 2020 14:01:46 +0500, "Andrey M. Borodin" <x4mmm@yandex-team.ru> wrote in
>
>
> > On 15 May 2020, at 05:03, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
> >
> > At Thu, 14 May 2020 11:44:01 +0500, "Andrey M. Borodin" <x4mmm@yandex-team.ru> wrote in
> >>> GetMultiXactIdMembers believes that 4 is successfully done if 2
> >>> returned valid offset, but actually that is not obvious.
> >>>
> >>> If we add a single giant lock just to isolate, say,
> >>> GetMultiXactIdMembers and RecordNewMultiXact, it reduces concurrency
> >>> unnecessarily.  Perhaps we need a finer-grained locking key for standby
> >>> that works similarly to the buffer lock on primary, and that doesn't cause
> >>> conflicts between irrelevant mxids.
> >>>
> >> We can just replay members before offsets. If offset is already there - members are there too.
> >> But I'd be happy if we could mitigate those 1000us too - with a hint about last mxid state in a shared MX state, for example.
> >
> > Generally in such cases, condition variables would work.  In the
> > attached PoC, the reader side gets no penalty in the "likely" code
> > path.  The writer side always calls ConditionVariableBroadcast but the
> > waiter list is empty in almost all cases.  But I couldn't cause the
> > situation where the 1000us sleep is reached.
> Thanks! That really looks like a good solution without magic timeouts. Beautiful!
> I think I can create a temporary extension which calls the MultiXact API and tests edge cases like this 1000us wait.
> This extension will also be useful for me to assess the impact of bigger buffers, reduced read locking (as in my 2nd patch) and other tweaks.

Happy to hear that. It would need to use a timeout just in case, though.
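
To make the point about the timeout concrete, here is a minimal, language-neutral sketch in Python. It is not the attached PoC and not PostgreSQL code: the class OffsetSlot and its methods are hypothetical names for illustration. The writer always broadcasts, the reader returns immediately on the likely path, and the wait is bounded by a timeout as a safeguard.

```python
import threading


class OffsetSlot:
    """Hypothetical stand-in for one offset entry; 0 means "not written yet"."""

    def __init__(self):
        self._cond = threading.Condition()
        self._offset = 0

    def publish(self, offset):
        # Writer side: store the offset, then wake every waiter.
        # In the common case the waiter list is empty and this is cheap.
        with self._cond:
            self._offset = offset
            self._cond.notify_all()

    def wait_for_offset(self, timeout=1.0):
        # Reader side: if the offset is already set, this returns at once.
        with self._cond:
            # wait_for rechecks the predicate after every wakeup and gives up
            # after `timeout` seconds, which is the "just in case" safeguard.
            self._cond.wait_for(lambda: self._offset != 0, timeout=timeout)
            # Still 0 here means the wait timed out.
            return self._offset
```

A caller that gets 0 back knows the wait timed out and can decide whether to retry or report an error, instead of sleeping for a fixed interval.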

> >> Actually, if we read empty mxid array instead of something that is replayed just yet - it's not a problem of inconsistency, because transaction in this mxid could not commit before we started. ISTM.
> >> So instead of a fix, we can probably just add a comment. If this reasoning is correct.
> >
> > The step 4 of the reader side reads the members of the target mxid. It
> > is already written if the offset of the *next* mxid is filled-in.
> Most often - yes, but members are not guaranteed to be filled in order. Those who win MXMemberControlLock will write first.
> But nobody can read members of MXID before it is returned. And its members will be written before returning MXID.

Yeah, right.  Otherwise an assertion failure happens.
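
The invariant agreed on here can be illustrated with a toy model. This is only a sketch of the ordering argument, not the multixact.c implementation; ToyMultiXactStore and its methods are hypothetical names. Members are stored before the id is handed to the caller, so a reader holding a returned id always finds them, and a missing entry trips an assertion.

```python
import threading


class ToyMultiXactStore:
    """Toy model, not PostgreSQL code: members are stored before the id is returned."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_id = 1
        self._members = {}

    def create(self, xids):
        with self._lock:
            mxid = self._next_id
            self._next_id += 1
            # Members are written before the id becomes visible to any caller.
            self._members[mxid] = tuple(xids)
        return mxid

    def get_members(self, mxid):
        with self._lock:
            # A caller can only hold an id that create() already returned,
            # so a missing entry would indicate a bug.
            assert mxid in self._members, "members missing for a returned mxid"
            return self._members[mxid]
```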


regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center


