Re: replication_origin and replication_origin_lsn usage on subscriber

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: replication_origin and replication_origin_lsn usage on subscriber
Дата
Msg-id CAA4eK1KWNrUQGFeYoLKdHXfGybE62LV1cD1nq7K-N5=xkL=RcQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: replication_origin and replication_origin_lsn usage on subscriber  (Petr Jelinek <petr@2ndquadrant.com>)
Ответы Re: replication_origin and replication_origin_lsn usage on subscriber  (Dilip Kumar <dilipbalaut@gmail.com>)
Список pgsql-hackers
On Thu, Jul 9, 2020 at 6:14 PM Petr Jelinek <petr@2ndquadrant.com> wrote:
>
> Hi,
>
> On 09/07/2020 14:34, Amit Kapila wrote:
> > On Thu, Jul 9, 2020 at 5:16 PM Petr Jelinek <petr@2ndquadrant.com> wrote:
> >>
> >> On 09/07/2020 13:10, Amit Kapila wrote:
> >>> On Thu, Feb 6, 2020 at 2:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >>>>
> >>>> During logical decoding, we send replication_origin and
> >>>> replication_origin_lsn when we decode commit.  In pgoutput_begin_txn,
> >>>> we send values for these two but never used on the subscriber side.
> >>>> Though we have provided a function (logicalrep_read_origin) to read
> >>>> these two values but that is not used in code anywhere.
> >>>>
> >>
> >> We don't use the origin message anywhere really because we don't support
> >> origin forwarding in the built-in replication yet. That part I left out
> >> intentionally in the original PG10 patchset as it's mostly useful for
> >> circular replication detection when you want to replicate both ways.
> >> However that's relatively useless without also having some kind of
> >> conflict detection which would be another huge pile of code and I
> >> expected we would end up not getting logical replication in PG10 at all
> >> if I tried to push conflict detection as well :)
> >>
> >
> > Fair enough.  However, without tests and more documentation about this
> > concept, it is likely that future development might break it.  It is
> > good that you and others who know this part well are there to respond
> > but still, the more documentation and tests would be preferred.
> >
>
> Honestly that part didn't even need to be committed given it's unused.
> Protocol supports versioning so it could have been added at later time.
>
> >>>
> >>> For the purpose of decoding in-progress transactions, I think we can
> >>> send replication_origin in the first 'start' message as it is present
> >>> with each WAL record, however replication_origin_lsn is only logged at
> >>> commit time, so can't send it before commit.  The
> >>> replication_origin_lsn is set by pg_replication_origin_xact_setup()
> >>> but it is not clear how and when that function can be used.  Do we
> >>> really need replication_origin_lsn before we decode the commit record?
> >>>
> >>
> >> That's the SQL interface, C interface does not require that and I don't
> >> think we need to do that.
> >>
> >
> > I think when you are saying SQL interface, you referred to
> > pg_replication_origin_xact_setup() but I am not sure which C interface
> > you are referring to in the above sentence?
> >
>
> All the stuff pg_replication_origin_xact_setup does internally.
>
> >> The existing apply code sets the
> >> replorigin_session_origin_lsn only when processing commit message IIRC.
> >>
> >
> > That's correct.  However, we do send it via 'begin' callback which
> > won't be possible with the streaming of in-progress transactions.  Do
> > we need to send this origin related information (origin, origin_lsn)
> > while streaming of in-progress transactions?  If so, when?  As far as
> > I can see, the origin_id can be sent with the first 'start' message.
> > The origin_lsn and origin_commit can be sent with the last 'start' of
> > streaming commit if we want but not sure if that is of use.  If we
> > need to send origin_lsn earlier than that then we need to record it
> > with other WAL records (other than Commit WAL record).
> >
>
> If we were to support the origin forwarding, then strictly speaking we
> need everything only at commit time from correctness perspective,
>

Okay.  Anyway streaming mode is optional, so in such cases, we can keep it 'off'

> but
> ideally origin_id would be best sent with first message as it can be
> used to filter out changes at decoding stage rather than while we
> process the commit so having it set early improves performance of decoding.
>

Yeah, makes sense.  So, we will just send origin_id (with first
streaming start message) and leave others.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fujii Masao
Дата:
Сообщение: Re: Implement UNLOGGED clause for COPY FROM
Следующее
От: Andrew Dunstan
Дата:
Сообщение: Re: Is this a bug in pg_current_logfile() on Windows?