RE: Skipping logical replication transactions on subscriber side

Поиск
Список
Период
Сортировка
От osumi.takamichi@fujitsu.com
Тема RE: Skipping logical replication transactions on subscriber side
Дата
Msg-id TYCPR01MB8373937465D9A994F4E9F981ED589@TYCPR01MB8373.jpnprd01.prod.outlook.com
обсуждение исходный текст
Ответ на Re: Skipping logical replication transactions on subscriber side  (Masahiko Sawada <sawada.mshk@gmail.com>)
Ответы Re: Skipping logical replication transactions on subscriber side  (Masahiko Sawada <sawada.mshk@gmail.com>)
Список pgsql-hackers
On Monday, January 17, 2022 9:52 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> Thank you for the comments!
..
> > (2) Minor improvement suggestion of comment in
> > src/backend/replication/logical/worker.c
> >
> > + * reset during that.  Also, we don't skip receiving the changes in
> > + streaming
> > + * cases, since we decide whether or not to skip applying the changes
> > + when
> >
> > I sugguest that you don't use 'streaming cases', because what
> > "streaming cases" means sounds a bit broader than actual your
> implementation.
> > We do skip transaction of streaming cases but not during the spooling phase,
> right ?
> >
> > I suggest below.
> >
> > "We don't skip receiving the changes at the phase to spool streaming
> transactions"
> 
> I might be missing your point but I think it's correct that we don't skip receiving
> the change of the transaction that is sent via streaming protocol. And it doesn't
> sound broader to me. Could you elaborate on that?
OK. Excuse me for lack of explanation.

I felt "streaming cases" implies "non-streaming cases"
to compare a diffference (in my head) when it is
used to explain something at first.
I imagined the contrast between those, when I saw it.

Thus, I thought "streaming cases" meant
whole flow of streaming transactions which consists of messages
surrounded by stream start and stream stop and which are finished by
stream commit/stream abort (including 2PC variations).

When I come back to the subject, you wrote below in the comment

"we don't skip receiving the changes in streaming cases,
since we decide whether or not to skip applying the changes
when starting to apply changes"

The first part of this sentence
("we don't skip receiving the changes in streaming cases")
gives me an impression where we don't skip changes in the streaming cases
(of my understanding above), but the last part
("we decide whether or not to skip applying the changes
when starting to apply change") means we skip transactions for streaming at apply phase.

So, this sentence looked confusing to me slightly.
Thus, I suggested below (and when I connect it with existing part)

"we don't skip receiving the changes at the phase to spool streaming transactions
since we decide whether or not to skip applying the changes when starting to apply changes"

For me this looked better, but of course, this is a suggestion.

> >
> > (3) in the comment of apply_handle_prepare_internal, two full-width
> characters.
> >
> > 3-1
> > +        * won’t be resent in a case where the server crashes between
> them.
> >
> > 3-2
> > +        * COMMIT PREPARED or ROLLBACK PREPARED. But that’s okay
> > + because this
> >
> > You have full-width characters for "won't" and "that's".
> > Could you please check ?
> 
> Which characters in "won't" are full-width characters? I could not find them.
All characters I found and mentioned as full-width are single quotes.

It might be good that you check the entire patch once
by some tool that helps you to detect it.

> > (5)
> >
> > I can miss something here but, in one of the past discussions, there
> > seems a consensus that if the user specifies XID of a subtransaction,
> > it would be better to skip only the subtransaction.
> >
> > This time, is it out of the range of the patch ?
> > If so, I suggest you include some description about it either in the
> > commit message or around codes related to it.
> 
> How can the user know subtransaction XID? I suppose you refer to streaming
> protocol cases but while applying spooled changes we don't report
> subtransaction XID neither in server log nor pg_stat_subscription_workers.
Yeah, usually subtransaction XID is not exposed to the users. I agree.

But, clarifying the target of this feature is only top-level transactions
sounds better to me. Thank you Amit-san for your support
about how we should write it in [1] !

> > (6)
> >
> > I feel it's a better idea to include a test whether to skip aborted
> > streaming transaction clears the XID in the TAP test for this feature,
> > in a sense to cover various new code paths. Did you have any special
> > reason to omit the case ?
> 
> Which code path is newly covered by this aborted streaming transaction tests?
> I think that this patch is already covered even by the test for a
> committed-and-streamed transaction. It doesn't matter whether the streamed
> transaction is committed or aborted because an error occurs while applying the
> spooled changes.
Oh, this was my mistake.  What I expressed as a new patch is
apply_handle_stream_abort -> clear_subscription_skip_xid.
But, this was totally wrong as you explained.


> >
> > (7)
> >
> > I want more explanation for the reason to restart the subscriber in
> > the TAP test because this is not mandatory operation.
> > (We can pass the TAP tests without this restart)
> >
> > From :
> > # Restart the subscriber node to restart logical replication with no
> > interval
> >
> > IIUC, below would be better.
> >
> > To :
> > # As an optimization to finish tests earlier, restart the subscriber
> > with no interval, # rather than waiting for new error to laucher a new apply
> worker.
> 
> I could not understand why the proposed sentence has more information.
> Does it mean you want to mention "As an optimization to finish tests earlier"?
Yes, exactly. The point is to add "As an optimization to finish tests earlier".

Probably, I should have asked a simple question "why do you restart the subscriber" ?
At first sight, I couldn't understand the meaning for the restart and
you don't explain the reason itself.

[1] - https://www.postgresql.org/message-id/CAA4eK1JHUF7fVNHQ1ZRRgVsdE8XDY8BruU9dNP3Q3jizNdpEbg%40mail.gmail.com


Best Regards,
    Takamichi Osumi


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Greg Nancarrow
Дата:
Сообщение: Re: row filtering for logical replication
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: row filtering for logical replication