Re: [PATCH] Replica sends an incorrect epoch in its hot standby feedback to the Master

From: Palamadai, Eka
Subject: Re: [PATCH] Replica sends an incorrect epoch in its hot standby feedback to the Master
Msg-id: 1878F176-63E7-4D6A-8894-1FC1B7BB5480@amazon.com
In reply to: Re: [PATCH] Replica sends an incorrect epoch in its hot standby feedback to the Master  (Thomas Munro <thomas.munro@gmail.com>)
Responses: Re: [PATCH] Replica sends an incorrect epoch in its hot standby feedback to the Master  (Juan José Santamaría Flecha <juanjo.santamaria@gmail.com>)
Re: [PATCH] Replica sends an incorrect epoch in its hot standby feedback to the Master  (Thomas Munro <thomas.munro@gmail.com>)
List: pgsql-hackers
Thanks a lot for the feedback. Please let me know if you have any further comments. Meanwhile, I have also added this
patch to "Commitfest 2020-03" at https://commitfest.postgresql.org/27/2464.
 

Thanks,
Eka Palamadai
Amazon Web Services

On 2/11/20, 11:28 PM, "Thomas Munro" <thomas.munro@gmail.com> wrote:

    On Fri, Feb 7, 2020 at 1:03 PM Palamadai, Eka <ekanatha@amazon.com> wrote:
    > The problem below occurs in Postgres versions 11, 10, and 9.6. It does not
    > occur in version 12 and later, because the commit [6] that added basic
    > infrastructure for 64-bit transaction IDs indirectly fixed it.
 
    
    I'm happy that that stuff is already fixing bugs we didn't know we
    had, but, yeah, it looks like it really only fixed it incidentally by
    moving all the duplicated "assign if higher" code into a function, not
    through the magical power of 64 bit xids.
    
    > The replica sends an incorrect epoch in its hot standby feedback to the
    > master in the scenario outlined below, where a checkpoint is interleaved
    > with the execution of 2 transactions at the master. The incorrect epoch in
    > the feedback causes the master to ignore the “oldest Xmin” X sent by the
    > replica. If a heap page prune [1] or vacuum were executed at the master
    > immediately thereafter, they may use a newer “oldest Xmin” Y > X, and
    > prematurely delete a tuple T such that X < t_xmax(T) < Y, which is still in
    > use at the replica as part of a long running read query Q. Subsequently,
    > when the replica replays the deletion of T as part of its WAL replay, it
    > cancels the long running query Q, causing unnecessary pain to customers.
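
To make the "master ignores the feedback" step concrete, here is a minimal, self-contained sketch of the kind of epoch sanity check a master could apply to a standby's reported xmin. It is a simplified model for illustration only, not the server's actual code: the function names `xid_precedes_or_equals` and `feedback_xmin_is_plausible` are hypothetical, and special (non-normal) XIDs are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Wraparound-aware "a <= b" on 32-bit XIDs (modulo-2^32 comparison).
 * Simplification: special XIDs (0, 1, 2) are not treated differently. */
bool xid_precedes_or_equals(TransactionId a, TransactionId b)
{
    return (int32_t)(a - b) <= 0;
}

/* Hypothetical check: is the (xid, epoch) pair reported by the standby
 * consistent with the master's own (next_xid, next_epoch)? */
bool feedback_xmin_is_plausible(TransactionId xid, uint32_t epoch,
                                TransactionId next_xid, uint32_t next_epoch)
{
    if (xid <= next_xid)
    {
        /* Numerically not ahead of next_xid: must be in the same epoch. */
        if (epoch != next_epoch)
            return false;
    }
    else
    {
        /* Numerically ahead: only valid if it is from the previous epoch. */
        if (epoch + 1 != next_epoch)
            return false;
    }

    /* Epoch is consistent; reject XIDs that are still "in the future". */
    return xid_precedes_or_equals(xid, next_xid);
}
```

Under this model, a feedback message carrying the right xmin but the wrong epoch is rejected outright, so the master computes its “oldest Xmin” without the replica's X, which is the situation described above.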
 
    
    Ouch.  Thanks for this analysis!
    
    > The variable “ShmemVariableCache->nextXid” (or “nextXid” for short) should
    > be monotonically increasing unless it wraps around to the next epoch.
    > However, in the above sequence, this property is violated on the replica in
    > the function “RecordKnownAssignedTransactionIds” [3], when the WAL replay
    > for the insertion at step 6 is executed at the replica.
 
    
    I haven't tried your repro or studied this closely yet, but yes, that
    assignment to nextXid does indeed look pretty fishy.  Other similar
    code elsewhere always does a check like in your patch, before
    clobbering nextXid.
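
The "assign if higher" check mentioned here can be sketched as follows. This is an illustrative, self-contained model rather than the patch itself: `xid_follows` and `advance_next_xid_past` are hypothetical names, and `next_xid` is a plain variable standing in for “ShmemVariableCache->nextXid” (no locking is shown).

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Assumed value of the first normal XID; 0, 1 and 2 are reserved. */
#define FIRST_NORMAL_XID ((TransactionId) 3)

/* Wraparound-aware "a is newer than b" (modulo-2^32 comparison). */
bool xid_follows(TransactionId a, TransactionId b)
{
    return (int32_t)(a - b) > 0;
}

/* Move *next_xid to xid + 1, but only if that is actually an advance.
 * An unconditional assignment here is what lets the counter go backwards. */
void advance_next_xid_past(TransactionId *next_xid, TransactionId xid)
{
    TransactionId candidate = xid + 1;

    /* Skip the reserved XIDs when the counter wraps around. */
    if (candidate < FIRST_NORMAL_XID)
        candidate = FIRST_NORMAL_XID;

    if (xid_follows(candidate, *next_xid))
        *next_xid = candidate;
}
```

With this guard, replaying a record for an older XID after a newer one has already been seen leaves the counter untouched, so it only moves forward or wraps into the next epoch.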
    

