Re: MultiXact\SLRU buffers configuration
From:        Andrey M. Borodin
Subject:     Re: MultiXact\SLRU buffers configuration
Date:
Msg-id:      102F673B-C9C5-4823-A92F-96799FC61ED0@yandex-team.ru
In reply to: Re: MultiXact\SLRU buffers configuration (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Replies:     Re: MultiXact\SLRU buffers configuration
List:        pgsql-hackers
> On 14 May 2020, at 11:16, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
>
> At Thu, 14 May 2020 10:19:42 +0500, "Andrey M. Borodin" <x4mmm@yandex-team.ru> wrote in
>>>> I'm looking more at MultiXact and it seems to me that we have a race condition there.
>>>>
>>>> When we create a new MultiXact we do:
>>>> 1. Generate a new MultiXactId under MultiXactGenLock
>>>> 2. Record the new mxid with its members and offset to WAL
>>>> 3. Write the offset to SLRU under MultiXactOffsetControlLock
>>>> 4. Write the members to SLRU under MultiXactMemberControlLock
>>>
>>> But, don't we hold an exclusive lock on the buffer through all the steps above?
>> Yes... unless the MultiXact is observed on a standby. This could lead to observing an inconsistent snapshot: one of the lockers committed a tuple delete, but the standby still sees the tuple as alive.
>
> Ah, right. I was looking at GetNewMultiXactId. Actually, XLOG_MULTIXACT_CREATE_ID is not protected from concurrent references to the mxact id being created, and GetMultiXactIdMembers takes that case into account.
>
>>>> When we read a MultiXact we do:
>>>> 1. Retrieve the offset by mxid from SLRU under MultiXactOffsetControlLock
>>>> 2. If the offset is 0, it has not been filled in at step 3 of the previous algorithm; we sleep and goto 1
>>>> 3. Retrieve the members from SLRU under MultiXactMemberControlLock
>>>> 4. ... what do we do if there are just zeroes because step 4 has not executed yet? Nothing - we return an empty members list.
>>>
>>> So transactions never see such incomplete mxids, I believe.
>> I've observed the sleep in step 2. I believe it's possible to observe the special effects of step 4 too.
>> Maybe we could add a lock on the standby to dismiss this 1000us wait? Sometimes it hits standbys hard: if someone is locking a whole table on the primary, all seq scans on standbys follow it with MultiXactOffsetControlLock contention.
>
> GetMultiXactIdMembers believes that step 4 is successfully done if step 2 returned a valid offset, but actually that is not obvious.
>
> If we add a single giant lock just to isolate, say, GetMultiXactIdMembers and RecordNewMultiXact, it reduces concurrency unnecessarily. Perhaps we need a finer-grained locking key for the standby that works similarly to the buffer lock on the primary, one that doesn't cause conflicts between irrelevant mxids.

We can just replay members before offsets: if the offset is already there, the members are there too. (Sketches of both the read-side race and this replay reordering follow below, after my signature.) But I'd be happy if we could mitigate those 1000us waits as well - with a hint about the last mxid's state in shared MX state, for example.

Actually, if we read an empty members array instead of something that is being replayed right now, that is not an inconsistency problem, because a transaction in this mxid could not have committed before we started. ISTM. So instead of a fix we can probably just add a comment, if this reasoning is correct.

Best regards, Andrey Borodin.
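P.S. To make the read-side race concrete, here is a minimal pseudo-C sketch. It is not the actual GetMultiXactIdMembers() from multixact.c; read_offset_slru() and read_members_slru() are hypothetical helpers standing in for the SLRU accesses done under MultiXactOffsetControlLock and MultiXactMemberControlLock, and the typedefs are simplified stand-ins for the real PostgreSQL types.

#include <stdint.h>

typedef uint32_t MultiXactId;
typedef uint32_t MultiXactOffset;
typedef struct MultiXactMember MultiXactMember;

/* Hypothetical stand-ins for the SLRU reads under the two control locks. */
extern MultiXactOffset read_offset_slru(MultiXactId multi);
extern int read_members_slru(MultiXactOffset offset, MultiXactMember **members);
extern void pg_usleep(long microsec);

int
get_members_sketch(MultiXactId multi, MultiXactMember **members)
{
    MultiXactOffset offset;

    for (;;)
    {
        /* Read step 1: fetch the offset under MultiXactOffsetControlLock. */
        offset = read_offset_slru(multi);
        if (offset != 0)
            break;

        /*
         * Read step 2: creation step 3 has not been replayed yet, so the
         * offset slot is still zero.  Sleep 1000us and retry -- this is the
         * wait that piles up on standbys under offset-lock contention.
         */
        pg_usleep(1000L);
    }

    /*
     * Read steps 3-4: fetch the members under MultiXactMemberControlLock.
     * A non-zero offset does not prove creation step 4 has run: the member
     * slots may still be zeroes, in which case we return an empty list.
     */
    return read_members_slru(offset, members);
}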
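And a sketch of the replay-reordering idea, under the assumption that redo currently makes the offset visible before the members. The write_*_slru() helpers are hypothetical (the real redo is multixact_redo() in multixact.c), and the struct layout here is a simplified rendering of xl_multixact_create.

#include <stdint.h>

typedef uint32_t MultiXactId;
typedef uint32_t MultiXactOffset;

typedef struct MultiXactMember
{
    uint32_t xid;               /* member transaction id */
    int      status;            /* lock/update mode */
} MultiXactMember;

typedef struct xl_multixact_create
{
    MultiXactId     mid;        /* new MultiXact's ID */
    MultiXactOffset moff;       /* its starting offset in the members SLRU */
    int32_t         nmembers;   /* number of members */
    MultiXactMember members[];  /* the members themselves */
} xl_multixact_create;

/* Hypothetical stand-ins for the SLRU writes done during redo. */
extern void write_members_slru(MultiXactOffset moff,
                               const MultiXactMember *members, int32_t n);
extern void write_offset_slru(MultiXactId mid, MultiXactOffset moff);

static void
multixact_redo_create_sketch(const xl_multixact_create *xlrec)
{
    /* First make the member data visible... */
    write_members_slru(xlrec->moff, xlrec->members, xlrec->nmembers);

    /*
     * ...then publish the offset pointing at it.  With the stores ordered
     * this way (the SLRU locks providing the necessary barriers), any
     * reader that sees a non-zero offset is guaranteed to find the members
     * already filled in, eliminating the empty-members anomaly.  The
     * 1000us wait for a still-zero offset remains, as noted above.
     */
    write_offset_slru(xlrec->mid, xlrec->moff);
}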