Re: IPC/MultixactCreation on the Standby server

Поиск
Список
Период
Сортировка
От Ivan Bykov
Тема Re: IPC/MultixactCreation on the Standby server
Дата
Msg-id 176399474623.1015.6638892143305450074.pgcf@coridan.postgresql.org
обсуждение исходный текст
Ответ на Re: IPC/MultixactCreation on the Standby server  (Andrey Borodin <x4mmm@yandex-team.ru>)
Ответы Re: IPC/MultixactCreation on the Standby server
Список pgsql-hackers
Hi, Andrey!

> Thanks for your review!

Review still in progress, sorry for the delay. I didn't have enough time to fully understand the changes you suggest,
butit seems there is 
 
only a small gap in my understanding of what the patch does. Here is my explanation of the problem.

The main problem
=============

The main problem is that we may lose session context before writing the offset to SLRU (but we may write 
a WAL record). It seems that the writer process got stuck in the XLogInsert procedure or even failed
between GetNewMultiXactId 
and RecordNewMultiXact call. In this case, readers will wait to receive a conditional variable signal (from new
multixacts)
 
but could not find a valid offset for the "failed" (it may be in WAL) multixid.

I illustrate this using the next diagram.

Writer                                        Reader             
--------------------------------------------------------------------------------
 MultiXactIdCreateFromMembers
 -> GetNewMultiXactId (101)
                                              GetMultiXactIdMembers(100)
                                              -> LWLockAcquire(MultiXactOffset)
                                              -> read offset 100
                                              -> read offset 101
                                              -> LWLockRelease(MultiXactOffset)
                                                 offset 101 == 0
                                              -> ConditionVariableSleep()
+--------------------------------------------------------------------------------------+
|-> XLogInsert                                                                         |
+--------------------------------------------------------------------------------------+
 -> RecordNewMultiXact
  -> LWLockAcquire(MultiXactOffset)
  -> write offset 101
  -> LWLockRelease(MultiXactOffset)
  -> ConditionVariableBroadcast(nextoff_cv);  
                                              -> retry:
                                              -> LWLockAcquire(MultiXactOffset)
                                              -> read offset 100
                                              -> read offset 101
                                              -> LWLockRelease(MultiXactOffset)
                                                 offset 101 != 0
                                              -> length = offset 101 - read offset 100
  -> LWLockAcquire(MultiXactMember)
  -> write members 101
  -> LWLockRelease(MultiXactOffset)

--------------------------------------------------------------------------------------

As far as I can see, your proposal seems to address exactly that problem.
The main difference from the former solution is writing to MultiXactOffset SLRU all required 
information for the reader atomically on multiact insertion. 
Before this change, we actually extended the multixact insertion time window to the next multixact 
insertion time, and it seems a risky design.

I illustrate the new solution using the next diagram.

Writer                                        Reader             
--------------------------------------------------------------------------------
 MultiXactIdCreateFromMembers
 -> GetNewMultiXactId (100)
 -> XLogInsert 
 -> RecordNewMultiXact
  -> LWLockAcquire(MultiXactOffset)
  -> write offset 100 
  -> write offset 101 *****************************************************************
  -> LWLockRelease(MultiXactOffset)
                                              GetMultiXactIdMembers(100)
                                              -> LWLockAcquire(MultiXactOffset)
                                              -> read offset 100
                                              -> read offset 101
                                              -> LWLockRelease(MultiXactOffset)
                                              Assert(offset 101 == 0)
                                              -> ConditionVariableSleep()
                                              -> length = offset 101 - read offset 100
--------------------------------------------------------------------------------------

So if I understand the core idea of your solution right, I think that the code in the last patch 
(v10-0001-Avoid-multixact-edge-case-2-by-writing-the-next-.patch) is correct and does what it should.

В списке pgsql-hackers по дате отправления: