Re: Time delayed LR (WAS Re: logical replication restrictions)

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: Time delayed LR (WAS Re: logical replication restrictions)
Дата
Msg-id CAA4eK1+Zfk=dvp+sxsQkHfxp_GtpxqFi46_fiuqGMfiaRzeYFQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Time delayed LR (WAS Re: logical replication restrictions)  (Masahiko Sawada <sawada.mshk@gmail.com>)
Ответы Re: Time delayed LR (WAS Re: logical replication restrictions)  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
On Fri, May 12, 2023 at 7:38 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, May 11, 2023 at 2:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Apr 28, 2023 at 2:35 PM Hayato Kuroda (Fujitsu)
> > <kuroda.hayato@fujitsu.com> wrote:
> > >
> > > Dear hackers,
> > >
> > > I rebased and refined my PoC. Followings are the changes:
> > >
> >
> > 1. Is my understanding correct that this patch creates the delay files
> > for each transaction? If so, did you consider other approaches such as
> > using one file to avoid creating many files?
> > 2. For streaming transactions, first the changes are written in the
> > temp file and then moved to the delay file. It seems like there is a
> > double work. Is it possible to unify it such that when min_apply_delay
> > is specified, we just use the delay file without sacrificing the
> > advantages like stream sub-abort can truncate the changes?
> > 3. Ideally, there shouldn't be a performance impact of this feature on
> > regular transactions because the delay file is created only when
> > min_apply_delay is active but better to do some testing of the same.
> >
>
> In addition to the points Amit raised, if the 'required_schema' option
> is specified in START_REPLICATION, the publisher sends schema
> information for every change. I think it leads to significant
> overhead. Did you consider alternative approaches such as sending the
> schema information for every transaction or the subscriber requests
> the publisher to send it?
>

Why do we need this new flag? I can't see any comments in the related
code which explain its need.

> > Overall, I think such an approach can address comments by Sawada-San
> > [1] but not sure if Sawada-San or others have any better ideas to
> > achieve this feature. It would be good to see what others think of
> > this approach.
> >
>
> I agree with this approach.
>
> When it comes to the idea of writing logical changes to permanent
> files, I think it would also be a good idea (and perhaps could be a
> building block of this feature) that we write streamed changes to a
> permanent file so that the apply worker can retry to apply them
> without retrieving the same changes again from the publisher.
>

I think we anyway won't be able to send confirmation till we write or
process the commit. If it gets interrupted anytime in between we need
to get all the changes again. I think using Fileset with temp files
has quite a few advantages for streaming as are noted in the header
comments of worker.c. We can investigate to replace that with
permanent files but I don't see that the advantages outweigh the
change. Also, after parallel apply, I am expecting, most users would
prefer that mode for large transactions, so making changes in the
serialized path doesn't seem like a good idea to me.

Having said that, I also thought that it would be a good idea if both
streaming and time-delayed can use the same code path in some way
w.r.t writing to files but couldn't come up with any good idea without
more downsides. I see that Kuroda-San has tried to keep the code path
isolated for this feature but still see that one can question the
implementation approach.

--
With Regards,
Amit Kapila.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Kirk Wolak
Дата:
Сообщение: Re: psql tests hangs
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: Time delayed LR (WAS Re: logical replication restrictions)