Re: Reorderbuffer crash during recovery

From Dilip Kumar
Subject Re: Reorderbuffer crash during recovery
Date
Msg-id CAFiTN-umcBr=SiQcf8TZq5dPQsiCYBfiWuU4AXX5MHUNnvp0jQ@mail.gmail.com
In response to Re: Reorderbuffer crash during recovery  (vignesh C <vignesh21@gmail.com>)
Responses Re: Reorderbuffer crash during recovery  (vignesh C <vignesh21@gmail.com>)
Re: Reorderbuffer crash during recovery  (vignesh C <vignesh21@gmail.com>)
List pgsql-bugs
On Tue, Dec 31, 2019 at 11:35 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, Dec 30, 2019 at 11:17 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Dec 27, 2019 at 8:37 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> > >
> > > On 2019-Dec-27, vignesh C wrote:
> > >
> > > > > I felt Amit's solution also solves the problem. Attached patch has the
> > > > fix based on the solution proposed.
> > > > Thoughts?
> > >
> > > This seems a sensible fix to me, though I didn't try to reproduce the
> > > failure.
> > >
> > > > @@ -2472,6 +2457,7 @@ ReorderBufferSerializeTXN(ReorderBuffer *rb, ReorderBufferTXN *txn)
> > > >               }
> > > >
> > > >               ReorderBufferSerializeChange(rb, txn, fd, change);
> > > > +             txn->final_lsn = change->lsn;
> > > >               dlist_delete(&change->node);
> > > >               ReorderBufferReturnChange(rb, change);
> > >
> > > > Should this be done inside ReorderBufferSerializeChange itself, instead
> > > of in its caller?
> > >
> >
> > makes sense.  But, I think we should add a comment specifying the
> > reason why it is important to set final_lsn while serializing the
> > change.
>
> Fixed
>
> > >  Also, would it be sane to verify that the TXN
> > > doesn't already have a newer final_lsn?  Maybe as an Assert.
> > >
> >
> > I don't think this is a good idea because we update the final_lsn with
> > commit_lsn in ReorderBufferCommit after which we can try to serialize
> > the remaining changes.  Instead, we should update it only if the
> > change_lsn value is greater than final_lsn.
> >
>
> Fixed.
> Thanks Alvaro & Amit for your suggestions. I have made the changes
> based on your suggestions. Please find the updated patch for the same.
> I have also verified the patch in back branches. Separate patch was
> required for Release-10 branch, patch for the same is attached as
> 0001-Reorder-buffer-crash-while-aborting-old-transactions-REL_10.patch.
> Thoughts?

One minor comment.  Otherwise, the patch looks fine to me.
+    /*
+     * We set final_lsn on a transaction when we decode its commit or abort
+     * record, but we never see those records for crashed transactions.  To
+     * ensure cleanup of these transactions, set final_lsn to that of their
+     * last change; this causes ReorderBufferRestoreCleanup to do the right
+     * thing. Final_lsn would have been set with commit_lsn earlier when we
+     * decode it commit, no need to update in that case
+     */
+    if (txn->final_lsn < change->lsn)
+        txn->final_lsn = change->lsn;

/decode it commit,/decode its commit,

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


