Re: long-standing data loss bug in initial sync of logical replication

From Amit Kapila
Subject Re: long-standing data loss bug in initial sync of logical replication
Msg-id CAA4eK1+9_iyZRr0TJjLb9HK3uVzQr31UCq=obqDph5JtE=cjLA@mail.gmail.com
In reply to Re: long-standing data loss bug in initial sync of logical replication  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
On Wed, Jun 26, 2024 at 4:57 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 6/25/24 07:04, Amit Kapila wrote:
> > On Mon, Jun 24, 2024 at 8:06 PM Tomas Vondra
> > <tomas.vondra@enterprisedb.com> wrote:
> >>
> >> On 6/24/24 12:54, Amit Kapila wrote:
> >>> ...
> >>>>
> >>>>>> I'm not sure there are any cases where using SRE instead of AE would cause
> >>>>>> problems for logical decoding, but it seems very hard to prove. I'd be very
> >>>>>> surprised if just using SRE would not lead to corrupted cache contents in some
> >>>>>> situations. The cases where a lower lock level is ok are ones where we just
> >>>>>> don't care that the cache is coherent in that moment.
> >>>>
> >>>>> Are you saying it might break cases that are not corrupted now? How
> >>>>> could obtaining a stronger lock have such effect?
> >>>>
> >>>> No, I mean that I don't know if using SRE instead of AE would have negative
> >>>> consequences for logical decoding. I.e. whether, from a logical decoding POV,
> >>>> it'd suffice to increase the lock level to just SRE instead of AE.
> >>>>
> >>>> Since I don't see how it'd be correct otherwise, it's kind of a moot question.
> >>>>
> >>>
> >>> We lost track of this thread and the bug is still open. IIUC, the
> >>> conclusion is to use SRE in OpenTableList() to fix the reported issue.
> >>> Andres, Tomas, please let me know if my understanding is wrong,
> >>> otherwise, let's proceed and fix this issue.
> >>>
> >>
> >> It's in the commitfest [https://commitfest.postgresql.org/48/4766/] so I
> >> don't think we 'lost track' of it, but it's true we haven't made much
> >> progress recently.
> >>
> >
> > Okay, thanks for pointing to the CF entry. Would you like to take care
> > of this? Are you seeing anything more than the simple fix to use SRE
> > in OpenTableList()?
> >
>
> I did not find a simpler fix than adding the SRE, and I think pretty
> much any other fix is guaranteed to be more complex. I don't remember
> all the details without relearning them, but IIRC the main
> challenge for me was to convince myself it's a sufficient and reliable
> fix (and not working simply by chance).
>
> I won't have time to look into this anytime soon, so feel free to take
> care of this and push the fix.
>

Okay, I'll take care of this.

--
With Regards,
Amit Kapila.


