Thread: Skipping logical replication transactions on subscriber side

Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

May 24, 2021, 11:01:34

Hi all,

If a logical replication worker cannot apply the change on the
subscriber for some reason (e.g., missing table or violating a
constraint, etc.), logical replication stops until the problem is
resolved. Ideally, we resolve the problem on the subscriber (e.g., by
creating the missing table or removing the conflicting data, etc.) but
occasionally a problem cannot be fixed and it may be necessary to skip
the entire transaction in question. Currently, we have two ways to
skip transactions: advancing the LSN of the replication origin on the
subscriber and advancing the LSN of the replication slot on the
publisher. But both ways might not be able to skip exactly one
transaction in question and end up skipping other transactions too.

I’d like to propose a way to skip the particular transaction on the
subscriber side. As the first step, a transaction can be specified to
be skipped by specifying remote XID on the subscriber. This feature
would need two sub-features: (1) a sub-feature for users to identify
the problem subscription and the problem transaction’s XID, and (2) a
sub-feature to skip the particular transaction to apply.

For (1), I think the simplest way would be to put the details of the
change being applied in errcontext. For example, the following
errcontext shows the remote XID as well as the action name, the
relation name, and commit timestamp:

ERROR:  duplicate key value violates unique constraint "test_pkey"
DETAIL:  Key (c)=(1) already exists.
CONTEXT:  during apply of "INSERT" for relation "public.test" in
transaction with xid 590 commit timestamp 2021-05-21
14:32:02.134273+09

The user can identify which remote XID has a problem during applying
the change (XID=590 in this case). As another idea, we can have a
statistics view for logical replication workers, showing information
of the last failure transaction.
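
A minimal Python sketch of how a monitoring script might pull the remote XID out of such a message. It assumes the CONTEXT wording proposed above, which is not an existing PostgreSQL log format and may change; `parse_apply_context` is a hypothetical helper name.

```python
import re

# Pattern for the CONTEXT line as proposed in this message. This is an
# assumption: the final wording in PostgreSQL may differ.
CONTEXT_RE = re.compile(
    r'during apply of "(?P<action>[A-Z]+)" '
    r'for relation "(?P<relation>[^"]+)" '
    r'in transaction with xid (?P<xid>\d+) '
    r'commit timestamp (?P<commit_ts>.+)'
)


def parse_apply_context(log_text):
    """Return the fields of the proposed CONTEXT line, or None if absent."""
    # Log output may be wrapped across lines, so collapse whitespace first.
    flat = " ".join(log_text.split())
    match = CONTEXT_RE.search(flat)
    if match is None:
        return None
    fields = match.groupdict()
    fields["xid"] = int(fields["xid"])
    return fields


sample = '''ERROR:  duplicate key value violates unique constraint "test_pkey"
DETAIL:  Key (c)=(1) already exists.
CONTEXT:  during apply of "INSERT" for relation "public.test" in
transaction with xid 590 commit timestamp 2021-05-21
14:32:02.134273+09'''

info = parse_apply_context(sample)
print(info["xid"], info["relation"])
```

Note that the message does not carry the subscription name, so a script like this could not tell which subscription to alter if several are failing.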

For (2), what I'm thinking is to add a new action to ALTER
SUBSCRIPTION command like ALTER SUBSCRIPTION test_sub SET SKIP
TRANSACTION 590. Also, we can have actions to reset it; ALTER
SUBSCRIPTION test_sub RESET SKIP TRANSACTION. Those commands add the
XID to a new column of pg_subscription or a new catalog, having the
worker reread its subscription information. Once the worker skipped
the specified transaction, it resets the transaction to skip on the
catalog. The syntax allows users to specify one remote XID to skip. In
the future, it might be good if users can also specify multiple XIDs
(a range of XIDs or a list of XIDs, etc).
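
The intended behaviour can be modelled with a small Python sketch. This is only an illustration of the idea, not the apply worker's code: `SubscriptionCatalog`, `apply_stream`, and the `(kind, xid, payload)` message tuples are hypothetical stand-ins for the proposed catalog column and the replication stream.

```python
class SubscriptionCatalog:
    """Stand-in for the proposed skip-XID column (hypothetical)."""

    def __init__(self):
        self.skip_xid = None

    def set_skip_transaction(self, xid):
        # Models: ALTER SUBSCRIPTION test_sub SET SKIP TRANSACTION <xid>
        self.skip_xid = xid

    def reset_skip_transaction(self):
        # Models: ALTER SUBSCRIPTION test_sub RESET SKIP TRANSACTION
        self.skip_xid = None


def apply_stream(catalog, messages):
    """Apply (kind, xid, payload) messages, ignoring the marked transaction.

    Returns the payloads that were actually applied.
    """
    applied = []
    for kind, xid, payload in messages:
        skipping = catalog.skip_xid is not None and xid == catalog.skip_xid
        if kind == "change" and not skipping:
            applied.append(payload)
        elif kind == "commit" and skipping:
            # The whole transaction was ignored. Clear the marker so the
            # same XID value is not skipped again later.
            catalog.reset_skip_transaction()
    return applied


catalog = SubscriptionCatalog()
catalog.set_skip_transaction(590)
messages = [
    ("begin", 589, None), ("change", 589, "INSERT 0"), ("commit", 589, None),
    ("begin", 590, None), ("change", 590, "INSERT 1"), ("commit", 590, None),
    ("begin", 591, None), ("change", 591, "INSERT 2"), ("commit", 591, None),
]
applied = apply_stream(catalog, messages)
print(applied, catalog.skip_xid)
```

In the sketch the marker is cleared only when the commit of the skipped transaction is seen, which is what makes the skip a one-shot action.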

Feedback and comments are very welcome.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

May 24, 2021, 13:51:34

On Mon, May 24, 2021 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> If a logical replication worker cannot apply the change on the
> subscriber for some reason (e.g., missing table or violating a
> constraint, etc.), logical replication stops until the problem is
> resolved. Ideally, we resolve the problem on the subscriber (e.g., by
> creating the missing table or removing the conflicting data, etc.) but
> occasionally a problem cannot be fixed and it may be necessary to skip
> the entire transaction in question. Currently, we have two ways to
> skip transactions: advancing the LSN of the replication origin on the
> subscriber and advancing the LSN of the replication slot on the
> publisher. But both ways might not be able to skip exactly one
> transaction in question and end up skipping other transactions too.
>
> I’d like to propose a way to skip the particular transaction on the
> subscriber side. As the first step, a transaction can be specified to
> be skipped by specifying remote XID on the subscriber. This feature
> would need two sub-features: (1) a sub-feature for users to identify
> the problem subscription and the problem transaction’s XID, and (2) a
> sub-feature to skip the particular transaction to apply.
>
> For (1), I think the simplest way would be to put the details of the
> change being applied in errcontext. For example, the following
> errcontext shows the remote XID as well as the action name, the
> relation name, and commit timestamp:
>
> ERROR:  duplicate key value violates unique constraint "test_pkey"
> DETAIL:  Key (c)=(1) already exists.
> CONTEXT:  during apply of "INSERT" for relation "public.test" in
> transaction with xid 590 commit timestamp 2021-05-21
> 14:32:02.134273+09
>

In the above, the subscription name/id is not mentioned. I think you
need it for sub-feature-2.

> The user can identify which remote XID has a problem during applying
> the change (XID=590 in this case). As another idea, we can have a
> statistics view for logical replication workers, showing information
> of the last failure transaction.
>

It might be good to display at both places. Having subscriber-side
information in the view might be helpful in other ways as well like we
can use it to display the number of transactions processed by a
particular subscriber.

I think you need to consider a few more things here:
(a) Say the error occurs after applying some part of the changes; then
just skipping the remaining part won't be sufficient. We probably need
to somehow roll back the applied changes (by rolling back the
transaction or in some other way).
(b) How do you handle streamed transactions? It is possible that some
of the streams are successful and the error occurs after that, say
when writing to the stream file. Now, would you skip writing to stream
file or will you write it, and then during apply, you will skip the
entire transaction and remove the corresponding stream file.
(c) There is also a possibility that the error occurs while applying
the changes of some subtransaction (this is only possible for
streaming xacts), so, in such cases, do we allow users to rollback the
subtransaction or user has to rollback the entire transaction. I am
not sure but maybe for very large transactions users might just want
to rollback the subtransaction.
(d) How about prepared transactions? Do we need to rollback the
prepared transaction if user decides to skip such a transaction? We
already allow prepared transactions to be streamed to plugins and the
work for subscriber-side apply is in progress [1], so I think we need
to consider this case as well.
(e) Do we want to provide such a feature via output plugins as well,
if not, why?

> For (2), what I'm thinking is to add a new action to ALTER
> SUBSCRIPTION command like ALTER SUBSCRIPTION test_sub SET SKIP
> TRANSACTION 590. Also, we can have actions to reset it; ALTER
> SUBSCRIPTION test_sub RESET SKIP TRANSACTION. Those commands add the
> XID to a new column of pg_subscription or a new catalog, having the
> worker reread its subscription information. Once the worker skipped
> the specified transaction, it resets the transaction to skip on the
> catalog.
>

What if we fail while updating the reset information in the catalog?
Will it be the responsibility of the user to reset such a transaction
or we will retry it after restart of worker?  Now, say, we give such a
responsibility to the user and the user forgets to reset it then there
is a possibility that after wraparound we will again skip the
transaction which is not intended. And, if we want to retry it after
restart of worker, how will the worker remember the previous failure?

I think this will be a useful feature but we need to consider a few more things.

[1] - https://www.postgresql.org/message-id/CAHut%2BPsDysQA%3DJWXb6oGFr1npvqi1e7RzzXV-juCCxnbiwHvfA%40mail.gmail.com

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Bharath Rupireddy

Date:

May 25, 2021, 08:49:08

On Mon, May 24, 2021 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Hi all,
>
> If a logical replication worker cannot apply the change on the
> subscriber for some reason (e.g., missing table or violating a
> constraint, etc.), logical replication stops until the problem is
> resolved. Ideally, we resolve the problem on the subscriber (e.g., by
> creating the missing table or removing the conflicting data, etc.) but
> occasionally a problem cannot be fixed and it may be necessary to skip
> the entire transaction in question. Currently, we have two ways to
> skip transactions: advancing the LSN of the replication origin on the
> subscriber and advancing the LSN of the replication slot on the
> publisher. But both ways might not be able to skip exactly one
> transaction in question and end up skipping other transactions too.

Does it mean pg_replication_origin_advance() can't skip exactly one
txn? I'm not familiar with the function or never used it though, I was
just searching for "how to skip a single txn in postgres" and ended up
in [1]. Could you please give some more details on scenarios when we
can't skip exactly one txn? Is there any other way to advance the LSN,
something like directly updating the pg_replication_slots catalog?

[1] - https://www.postgresql.org/docs/devel/logical-replication-conflicts.html

> I’d like to propose a way to skip the particular transaction on the
> subscriber side. As the first step, a transaction can be specified to
> be skipped by specifying remote XID on the subscriber. This feature
> would need two sub-features: (1) a sub-feature for users to identify
> the problem subscription and the problem transaction’s XID, and (2) a
> sub-feature to skip the particular transaction to apply.
>
> For (1), I think the simplest way would be to put the details of the
> change being applied in errcontext. For example, the following
> errcontext shows the remote XID as well as the action name, the
> relation name, and commit timestamp:
>
> ERROR:  duplicate key value violates unique constraint "test_pkey"
> DETAIL:  Key (c)=(1) already exists.
> CONTEXT:  during apply of "INSERT" for relation "public.test" in
> transaction with xid 590 commit timestamp 2021-05-21
> 14:32:02.134273+09
>
> The user can identify which remote XID has a problem during applying
> the change (XID=590 in this case). As another idea, we can have a
> statistics view for logical replication workers, showing information
> of the last failure transaction.

Agree with Amit on this. At times, it is difficult to look around in
the server logs, so it will be better to have it in both places.

> For (2), what I'm thinking is to add a new action to ALTER
> SUBSCRIPTION command like ALTER SUBSCRIPTION test_sub SET SKIP
> TRANSACTION 590. Also, we can have actions to reset it; ALTER
> SUBSCRIPTION test_sub RESET SKIP TRANSACTION. Those commands add the
> XID to a new column of pg_subscription or a new catalog, having the
> worker reread its subscription information. Once the worker skipped
> the specified transaction, it resets the transaction to skip on the
> catalog. The syntax allows users to specify one remote XID to skip. In
> the future, it might be good if users can also specify multiple XIDs
> (a range of XIDs or a list of XIDs, etc).

What's it like skipping a txn with txn id? Is it that the particular
txn is forced to commit or abort or just skipping some of the code in
the apply worker? IIUC, the behavior of RESET SKIP TRANSACTION is just
to forget the txn id specified in SET SKIP TRANSACTION right?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

May 25, 2021, 09:55:36

On Mon, May 24, 2021 at 7:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, May 24, 2021 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > If a logical replication worker cannot apply the change on the
> > subscriber for some reason (e.g., missing table or violating a
> > constraint, etc.), logical replication stops until the problem is
> > resolved. Ideally, we resolve the problem on the subscriber (e.g., by
> > creating the missing table or removing the conflicting data, etc.) but
> > occasionally a problem cannot be fixed and it may be necessary to skip
> > the entire transaction in question. Currently, we have two ways to
> > skip transactions: advancing the LSN of the replication origin on the
> > subscriber and advancing the LSN of the replication slot on the
> > publisher. But both ways might not be able to skip exactly one
> > transaction in question and end up skipping other transactions too.
> >
> > I’d like to propose a way to skip the particular transaction on the
> > subscriber side. As the first step, a transaction can be specified to
> > be skipped by specifying remote XID on the subscriber. This feature
> > would need two sub-features: (1) a sub-feature for users to identify
> > the problem subscription and the problem transaction’s XID, and (2) a
> > sub-feature to skip the particular transaction to apply.
> >
> > For (1), I think the simplest way would be to put the details of the
> > change being applied in errcontext. For example, the following
> > errcontext shows the remote XID as well as the action name, the
> > relation name, and commit timestamp:
> >
> > ERROR:  duplicate key value violates unique constraint "test_pkey"
> > DETAIL:  Key (c)=(1) already exists.
> > CONTEXT:  during apply of "INSERT" for relation "public.test" in
> > transaction with xid 590 commit timestamp 2021-05-21
> > 14:32:02.134273+09
> >
>
> In the above, the subscription name/id is not mentioned. I think you
> need it for sub-feature-2.

Agreed.

>
> > The user can identify which remote XID has a problem during applying
> > the change (XID=590 in this case). As another idea, we can have a
> > statistics view for logical replication workers, showing information
> > of the last failure transaction.
> >
>
> It might be good to display at both places. Having subscriber-side
> information in the view might be helpful in other ways as well like we
> can use it to display the number of transactions processed by a
> particular subscriber.

Yes. I think we can report that information to the stats collector. The
information needs to survive even after the worker exits.

>
> I think you need to consider a few more things here:
> (a) Say the error occurs after applying some part of the changes; then
> just skipping the remaining part won't be sufficient. We probably need
> to somehow roll back the applied changes (by rolling back the
> transaction or in some other way).

After more thought, it might be better to require that the subscription
be disabled when setting or resetting the XID to skip. This would not be
a restriction for users since logical replication has likely already
stopped (possibly restarting and stopping repeatedly) due to an error.
Setting and resetting the XID modifies the system catalog so it's a
crash-safe change and survives beyond the server restarts. When a
logical replication worker starts, it checks the XID. If the worker
receives changes associated with the transaction with the specified
XID, it can ignore the entire transaction.

> (b) How do you handle streamed transactions? It is possible that some
> of the streams are successful and the error occurs after that, say
> when writing to the stream file. Now, would you skip writing to stream
> file or will you write it, and then during apply, you will skip the
> entire transaction and remove the corresponding stream file.

I think streamed transactions can be handled in the same way described in (a).

> (c) There is also a possibility that the error occurs while applying
> the changes of some subtransaction (this is only possible for
> streaming xacts), so, in such cases, do we allow users to rollback the
> subtransaction or user has to rollback the entire transaction. I am
> not sure but maybe for very large transactions users might just want
> to rollback the subtransaction.

If the user specifies the XID of a subtransaction, it would be better to
skip only the subtransaction. If the user specifies the top-level
transaction's XID, it would be better to skip the entire transaction.
What do you think?

> (d) How about prepared transactions? Do we need to rollback the
> prepared transaction if user decides to skip such a transaction? We
> already allow prepared transactions to be streamed to plugins and the
> work for subscriber-side apply is in progress [1], so I think we need
> to consider this case as well.

If a transaction replicated from the publisher could be prepared on
the subscriber, it is guaranteed that it can be either
committed or rolled back. Given that this feature is to skip a problem
transaction, I think it should not do anything for transactions that
are already prepared on the subscriber.

> (e) Do we want to provide such a feature via output plugins as well,
> if not, why?

You mean to specify an XID to skip on the publisher side? Since I've
been considering this feature as a way to resume logical replication
that has hit a problem, I've not thought of that idea, but it
would be a good idea. Do you have any use cases? If we specified the
XID on the publisher, multiple subscribers would skip that
transaction.

>
> > For (2), what I'm thinking is to add a new action to ALTER
> > SUBSCRIPTION command like ALTER SUBSCRIPTION test_sub SET SKIP
> > TRANSACTION 590. Also, we can have actions to reset it; ALTER
> > SUBSCRIPTION test_sub RESET SKIP TRANSACTION. Those commands add the
> > XID to a new column of pg_subscription or a new catalog, having the
> > worker reread its subscription information. Once the worker skipped
> > the specified transaction, it resets the transaction to skip on the
> > catalog.
> >
>
> What if we fail while updating the reset information in the catalog?
> Will it be the responsibility of the user to reset such a transaction
> or we will retry it after restart of worker?  Now, say, we give such a
> responsibility to the user and the user forgets to reset it then there
> is a possibility that after wraparound we will again skip the
> transaction which is not intended. And, if we want to retry it after
> restart of worker, how will the worker remember the previous failure?

As described above, setting and resetting the XID to skip is implemented
as a normal system catalog change, so it's crash-safe and persisted. I
think that the worker can either remove the XID or mark it as done
once it has skipped the specified transaction so that it won't skip the
same XID again after wraparound. Also, it might be better to reset
the XID when a subscription field such as subconninfo is changed,
because that could imply the worker will connect to another publisher
with a different XID space.

We also need to handle the cases where the user specifies an old XID
or XID whose transaction is already prepared on the subscriber. I
think the worker can reset the XID with a warning when it finds out
that the XID seems no longer valid or it cannot skip the specified
XID. For example in the former case, it can do that when the first
received transaction’s XID is newer than the specified XID. In the
latter case, it can do that when it receives the commit/rollback
prepared message of the specified XID.
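
A minimal Python sketch of the "old XID" check described above. It assumes PostgreSQL's usual modulo-2^32 XID ordering (as done by `TransactionIdPrecedes` in the server); `xid_precedes` and `check_skip_xid` are hypothetical names, and the warning text is invented for illustration.

```python
FIRST_NORMAL_XID = 3      # XIDs 0 to 2 are reserved in PostgreSQL
XID_SPACE = 2 ** 32       # XIDs are 32-bit counters that wrap around


def xid_precedes(a, b):
    """True if XID a is logically older than XID b (modulo-2^32 ordering)."""
    if a < FIRST_NORMAL_XID or b < FIRST_NORMAL_XID:
        return a < b      # reserved XIDs compare as plain integers
    diff = (a - b) % XID_SPACE
    return diff >= 2 ** 31    # the signed 32-bit difference is negative


def check_skip_xid(skip_xid, first_received_xid):
    """Return (skip_xid_to_keep, warning) for the rule sketched above."""
    if skip_xid is not None and xid_precedes(skip_xid, first_received_xid):
        # The first transaction received is already newer than the XID to
        # skip, so the stored value is treated as stale and reset.
        return None, f"skip XID {skip_xid} looks too old; resetting it"
    return skip_xid, None


print(check_skip_xid(590, 600))
print(check_skip_xid(590, 580))
print(xid_precedes(2 ** 32 - 5, 10))
```

The last call shows why a leftover skip XID is dangerous: because the counter wraps, a plain equality test cannot tell XID 590 from the XID 590 that appears one wraparound later, so the stored value has to be cleared once it has served its purpose.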

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

May 25, 2021, 11:13:37

On Tue, May 25, 2021 at 2:49 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Mon, May 24, 2021 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Hi all,
> >
> > If a logical replication worker cannot apply the change on the
> > subscriber for some reason (e.g., missing table or violating a
> > constraint, etc.), logical replication stops until the problem is
> > resolved. Ideally, we resolve the problem on the subscriber (e.g., by
> > creating the missing table or removing the conflicting data, etc.) but
> > occasionally a problem cannot be fixed and it may be necessary to skip
> > the entire transaction in question. Currently, we have two ways to
> > skip transactions: advancing the LSN of the replication origin on the
> > subscriber and advancing the LSN of the replication slot on the
> > publisher. But both ways might not be able to skip exactly one
> > transaction in question and end up skipping other transactions too.
>
> Does it mean pg_replication_origin_advance() can't skip exactly one
> txn? I'm not familiar with the function or never used it though, I was
> just searching for "how to skip a single txn in postgres" and ended up
> in [1]. Could you please give some more details on scenarios when we
> can't skip exactly one txn? Is there any other way to advance the LSN,
> something like directly updating the pg_replication_slots catalog?

Sorry, it's not impossible. Although the user might mistakenly skip more
than one transaction by specifying a wrong LSN, it's always possible to
skip exactly one transaction.

>
> [1] - https://www.postgresql.org/docs/devel/logical-replication-conflicts.html
>
> > I’d like to propose a way to skip the particular transaction on the
> > subscriber side. As the first step, a transaction can be specified to
> > be skipped by specifying remote XID on the subscriber. This feature
> > would need two sub-features: (1) a sub-feature for users to identify
> > the problem subscription and the problem transaction’s XID, and (2) a
> > sub-feature to skip the particular transaction to apply.
> >
> > For (1), I think the simplest way would be to put the details of the
> > change being applied in errcontext. For example, the following
> > errcontext shows the remote XID as well as the action name, the
> > relation name, and commit timestamp:
> >
> > ERROR:  duplicate key value violates unique constraint "test_pkey"
> > DETAIL:  Key (c)=(1) already exists.
> > CONTEXT:  during apply of "INSERT" for relation "public.test" in
> > transaction with xid 590 commit timestamp 2021-05-21
> > 14:32:02.134273+09
> >
> > The user can identify which remote XID has a problem during applying
> > the change (XID=590 in this case). As another idea, we can have a
> > statistics view for logical replication workers, showing information
> > of the last failure transaction.
>
> Agree with Amit on this. At times, it is difficult to look around in
> the server logs, so it will be better to have it in both places.
>
> > For (2), what I'm thinking is to add a new action to ALTER
> > SUBSCRIPTION command like ALTER SUBSCRIPTION test_sub SET SKIP
> > TRANSACTION 590. Also, we can have actions to reset it; ALTER
> > SUBSCRIPTION test_sub RESET SKIP TRANSACTION. Those commands add the
> > XID to a new column of pg_subscription or a new catalog, having the
> > worker reread its subscription information. Once the worker skipped
> > the specified transaction, it resets the transaction to skip on the
> > catalog. The syntax allows users to specify one remote XID to skip. In
> > the future, it might be good if users can also specify multiple XIDs
> > (a range of XIDs or a list of XIDs, etc).
>
> What's it like skipping a txn with txn id? Is it that the particular
> txn is forced to commit or abort or just skipping some of the code in
> the apply worker?

What I'm thinking is to ignore the entire transaction with the
specified XID. IOW Logical replication workers don't even start the
transaction and ignore all changes associated with the XID.

>  IIUC, the behavior of RESET SKIP TRANSACTION is just
> to forget the txn id specified in SET SKIP TRANSACTION right?

Right. I proposed this RESET command for users to cancel the skipping behavior.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Bharath Rupireddy

Date:

May 25, 2021, 13:21:09

On Tue, May 25, 2021 at 1:44 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, May 25, 2021 at 2:49 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > On Mon, May 24, 2021 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > Hi all,
> > >
> > > If a logical replication worker cannot apply the change on the
> > > subscriber for some reason (e.g., missing table or violating a
> > > constraint, etc.), logical replication stops until the problem is
> > > resolved. Ideally, we resolve the problem on the subscriber (e.g., by
> > > creating the missing table or removing the conflicting data, etc.) but
> > > occasionally a problem cannot be fixed and it may be necessary to skip
> > > the entire transaction in question. Currently, we have two ways to
> > > skip transactions: advancing the LSN of the replication origin on the
> > > subscriber and advancing the LSN of the replication slot on the
> > > publisher. But both ways might not be able to skip exactly one
> > > transaction in question and end up skipping other transactions too.
> >
> > Does it mean pg_replication_origin_advance() can't skip exactly one
> > txn? I'm not familiar with the function or never used it though, I was
> > just searching for "how to skip a single txn in postgres" and ended up
> > in [1]. Could you please give some more details on scenarios when we
> > can't skip exactly one txn? Is there any other way to advance the LSN,
> > something like directly updating the pg_replication_slots catalog?
>
> Sorry, it's not impossible. Although the user might mistakenly skip more
> than one transaction by specifying a wrong LSN, it's always possible to
> skip exactly one transaction.

IIUC, if the user specifies the "correct LSN", then it's possible to
skip exact txn for which the sync workers are unable to apply changes,
right?

How can the user get the LSN (which we call "correct LSN")? Is it from
pg_replication_slots? Or some other way?

If the user somehow can get the "correct LSN", can't the exact txn be
skipped using it with any of the existing ways, either using
pg_replication_origin_advance or any other ways?

If there's no way to get the "correct LSN", then why can't we just
print that LSN in the error context and/or in the new statistics view
for logical replication workers, so that any of the existing ways can
be used to skip exactly one txn?

IIUC, the feature proposed here guards against the users specifying
wrong LSN. If I'm right, what is the guarantee that users don't
specify the wrong txn id? Why can't we tell the users when a wrong LSN
is specified that "currently, an apply worker is failing to apply the
LSN XXXX, and you specified LSN YYYY, are you sure this is
intentional?"

Please correct me if I'm missing anything.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

May 25, 2021, 15:41:27

On Tue, May 25, 2021 at 7:21 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Tue, May 25, 2021 at 1:44 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, May 25, 2021 at 2:49 PM Bharath Rupireddy
> > <bharath.rupireddyforpostgres@gmail.com> wrote:
> > >
> > > On Mon, May 24, 2021 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > If a logical replication worker cannot apply the change on the
> > > > subscriber for some reason (e.g., missing table or violating a
> > > > constraint, etc.), logical replication stops until the problem is
> > > > resolved. Ideally, we resolve the problem on the subscriber (e.g., by
> > > > creating the missing table or removing the conflicting data, etc.) but
> > > > occasionally a problem cannot be fixed and it may be necessary to skip
> > > > the entire transaction in question. Currently, we have two ways to
> > > > skip transactions: advancing the LSN of the replication origin on the
> > > > subscriber and advancing the LSN of the replication slot on the
> > > > publisher. But both ways might not be able to skip exactly one
> > > > transaction in question and end up skipping other transactions too.
> > >
> > > Does it mean pg_replication_origin_advance() can't skip exactly one
> > > txn? I'm not familiar with the function or never used it though, I was
> > > just searching for "how to skip a single txn in postgres" and ended up
> > > in [1]. Could you please give some more details on scenarios when we
> > > can't skip exactly one txn? Is there any other way to advance the LSN,
> > > something like directly updating the pg_replication_slots catalog?
> >
> > Sorry, it's not impossible. Although the user might mistakenly skip more
> > than one transaction by specifying a wrong LSN, it's always possible to
> > skip exactly one transaction.
>
> IIUC, if the user specifies the "correct LSN", then it's possible to
> skip exact txn for which the sync workers are unable to apply changes,
> right?
>
> How can the user get the LSN (which we call "correct LSN")? Is it from
> pg_replication_slots? Or some other way?
>
> If the user somehow can get the "correct LSN", can't the exact txn be
> skipped using it with any of the existing ways, either using
> pg_replication_origin_advance or any other ways?

One possible way I know is to copy the logical replication slot used
by the subscriber and peek at the changes to identify the correct LSN
(maybe there is another handy way though). For example, suppose that
two transactions insert tuples as follows on the publisher:

TX-A: BEGIN;
TX-A: INSERT INTO test VALUES (1);
TX-B: BEGIN;
TX-B: INSERT INTO test VALUES (10);
TX-B: COMMIT;
TX-A: INSERT INTO test VALUES (2);
TX-A: COMMIT;

And suppose further that the insertion with value = 10 (by TX-B)
cannot be applied on the subscriber due to a unique constraint
violation. If we copy the slot by
pg_copy_logical_replication_slot('test_sub', 'copy_slot', true,
'test_decoding'), we can peek at those changes with LSN as follows:

=# select * from pg_logical_slot_peek_changes('copy_slot', null, null) order by lsn;
    lsn    | xid |                   data
-----------+-----+------------------------------------------
 0/1911548 | 736 | BEGIN 736
 0/1911548 | 736 | table public.hoge: INSERT: c[integer]:1
 0/1911588 | 737 | BEGIN 737
 0/1911588 | 737 | table public.hoge: INSERT: c[integer]:10
 0/19115F8 | 737 | COMMIT 737
 0/1911630 | 736 | table public.hoge: INSERT: c[integer]:2
 0/19116A0 | 736 | COMMIT 736
(7 rows)

In this case, '0/19115F8' is the correct LSN to specify. We can
advance the replication origin to '0/19115F8' by
pg_replication_origin_advance() so that logical replication streams
transactions committed after '0/19115F8'. After logical replication
restarts, it skips the transaction with xid = 737 but
replicates the transaction with xid = 736.
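
The choice of LSN can be illustrated with a short Python sketch over the peeked rows. It assumes, as stated above, that only transactions committed after the chosen LSN are replicated; `parse_lsn`, `commit_lsn_of`, and `xids_skipped_by_advancing_to` are hypothetical helper names, and the rows are copied from the output above.

```python
def parse_lsn(text):
    """Convert an LSN such as '0/19115F8' to a 64-bit integer."""
    high, low = text.strip().split("/")
    return (int(high, 16) << 32) | int(low, 16)


# Rows as (lsn, xid, data), copied from the peek output above.
rows = [
    ("0/1911548", 736, "BEGIN 736"),
    ("0/1911548", 736, "table public.hoge: INSERT: c[integer]:1"),
    ("0/1911588", 737, "BEGIN 737"),
    ("0/1911588", 737, "table public.hoge: INSERT: c[integer]:10"),
    ("0/19115F8", 737, "COMMIT 737"),
    ("0/1911630", 736, "table public.hoge: INSERT: c[integer]:2"),
    ("0/19116A0", 736, "COMMIT 736"),
]


def commit_lsn_of(rows, xid):
    """Return the LSN of the COMMIT row for the given XID, or None."""
    for lsn, row_xid, data in rows:
        if row_xid == xid and data.startswith("COMMIT"):
            return lsn
    return None


def xids_skipped_by_advancing_to(rows, target_lsn):
    """XIDs whose COMMIT is at or before target_lsn (so not replicated)."""
    target = parse_lsn(target_lsn)
    return sorted(
        xid for lsn, xid, data in rows
        if data.startswith("COMMIT") and parse_lsn(lsn) <= target
    )


print(commit_lsn_of(rows, 737))
print(xids_skipped_by_advancing_to(rows, "0/19115F8"))
print(xids_skipped_by_advancing_to(rows, "0/19116A0"))
```

Advancing to the commit LSN of 737 skips only that transaction, while picking the later LSN by mistake would also skip 736, which is the kind of user error discussed in this thread.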

> If there's no way to get the "correct LSN", then why can't we just
> print that LSN in the error context and/or in the new statistics view
> for logical replication workers, so that any of the existing ways can
> be used to skip exactly one txn?

I think specifying XID to the subscription is more understandable for users.

>
> IIUC, the feature proposed here guards against the users specifying
> wrong LSN. If I'm right, what is the guarantee that users don't
> specify the wrong txn id? Why can't we tell the users when a wrong LSN
> is specified that "currently, an apply worker is failing to apply the
> LSN XXXX, and you specified LSN YYYY, are you sure this is
> intentional?"

With the initial idea, specifying the correct XID is the user's
responsibility. If they specify an old XID, the worker invalidates it and
raises a warning such as "the worker invalidated the specified XID as
it's too old". As a second idea, if we store the last failed XID
somewhere (e.g., a system catalog), the user can just specify to skip
that transaction. That is, instead of specifying the XID they could do
something like "ALTER SUBSCRIPTION test_sub RESOLVE CONFLICT BY SKIP".

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

May 26, 2021, 09:43:43

On Tue, May 25, 2021 at 12:26 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, May 24, 2021 at 7:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, May 24, 2021 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I think you need to consider few more things here:
> > (a) Say the error occurs after applying some part of changes, then
> > just skipping the remaining part won't be sufficient, we probably need
> > to someway rollback the applied changes (by rolling back the
> > transaction or in some other way).
>
> After more thought, it might be better to that setting and resetting
> the XID to skip requires disabling the subscription.
>

It might be better if it doesn't require disabling the subscription
because it would be more steps for the user to disable/enable it. It
is not clear to me what exactly you want to gain by disabling the
subscription in this case.

> This would not be
> a restriction for users since logical replication is likely to already
> stop (and possibly repeating restarting and stopping) due to an error.
> Setting and resetting the XID modifies the system catalog so it's a
> crash-safe change and survives beyond the server restarts. When a
> logical replication worker starts, it checks the XID. If the worker
> receives changes associated with the transaction with the specified
> XID, it can ignore the entire transaction.
>
> > (b) How do you handle streamed transactions? It is possible that some
> > of the streams are successful and the error occurs after that, say
> > when writing to the stream file. Now, would you skip writing to stream
> > file or will you write it, and then during apply, you will skip the
> > entire transaction and remove the corresponding stream file.
>
> I think streamed transactions can be handled in the same way described in (a).
>
> > (c) There is also a possibility that the error occurs while applying
> > the changes of some subtransaction (this is only possible for
> > streaming xacts), so, in such cases, do we allow users to rollback the
> > subtransaction or user has to rollback the entire transaction. I am
> > not sure but maybe for very large transactions users might just want
> > to rollback the subtransaction.
>
> If the user specifies XID of a subtransaction, it would be better to
> skip only the subtransaction. If specifies top transaction XID, it
> would be better to skip the entire transaction. What do you think?
>

Makes sense.

> > (d) How about prepared transactions? Do we need to rollback the
> > prepared transaction if user decides to skip such a transaction? We
> > already allow prepared transactions to be streamed to plugins and the
> > work for subscriber-side apply is in progress [1], so I think we need
> > to consider this case as well.
>
> If a transaction replicated from the subscriber could be prepared on
> the subscriber, it would be guaranteed to be able to be either
> committed or rolled back. Given that this feature is to skip a problem
> transaction, I think it should not do anything for transactions that
> are already prepared on the subscriber.
>

Makes sense, but I think we need to reset the XID in such a case.

> > (e) Do we want to provide such a feature via output plugins as well,
> > if not, why?
>
> You mean to specify an XID to skip on the publisher side? Since I've
> been considering this feature as a way to resume the logical
> replication having a problem I've not thought of that idea but It
> would be a good idea. Do you have any use cases?
>

No. On thinking about this again, I think we can leave this for now.

> If we specified the
> XID on the publisher, multiple subscribers would skip that
> transaction.
>
> >
> > > For (2), what I'm thinking is to add a new action to ALTER
> > > SUBSCRIPTION command like ALTER SUBSCRIPTION test_sub SET SKIP
> > > TRANSACTION 590. Also, we can have actions to reset it; ALTER
> > > SUBSCRIPTION test_sub RESET SKIP TRANSACTION. Those commands add the
> > > XID to a new column of pg_subscription or a new catalog, having the
> > > worker reread its subscription information. Once the worker skipped
> > > the specified transaction, it resets the transaction to skip on the
> > > catalog.
> > >
> >
> > What if we fail while updating the reset information in the catalog?
> > Will it be the responsibility of the user to reset such a transaction
> > or we will retry it after restart of worker?  Now, say, we give such a
> > responsibility to the user and the user forgets to reset it then there
> > is a possibility that after wraparound we will again skip the
> > transaction which is not intended. And, if we want to retry it after
> > restart of worker, how will the worker remember the previous failure?
>
> As described above, setting and resetting XID to skip is implemented
> as a normal system catalog change, so it's crash-safe and persisted. I
> think that the worker can either removes the XID or mark it as done
> once it skipped the specified transaction so that it won't skip the
> same XID again after wraparound.
>

It all depends on when exactly you want to update the catalog
information. Say that after skipping the commit of the XID, we update
the corresponding LSN to be communicated as already processed to the
subscriber, and then get an error while updating the catalog
information. Next time, we might not know whether to update the
catalog for the skipped XID.

> Also, it might be better if we reset
> the XID also when a subscription field such as subconninfo is changed
> because it could imply the worker will connect to another publisher
> having a different XID space.
>
> We also need to handle the cases where the user specifies an old XID
> or XID whose transaction is already prepared on the subscriber. I
> think the worker can reset the XID with a warning when it finds out
> that the XID seems no longer valid or it cannot skip the specified
> XID. For example in the former case, it can do that when the first
> received transaction’s XID is newer than the specified XID.
>

But how can we guarantee that an older XID can't be received later? Is
there a guarantee that we receive the transactions on the subscriber
in XID order?
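
A small Python sketch of why this matters, using the two transactions
from the pg_logical_slot_peek_changes() output earlier in the thread.
It assumes that decoded transactions arrive in commit-LSN order and
ignores XID wraparound; arrival_order and naive_is_stale are
hypothetical names for illustration only.

```python
from typing import List, Tuple

# (xid, commit LSN) pairs from the earlier peek_changes output:
# xid 736 started first but committed after xid 737.
COMMITS: List[Tuple[int, int]] = [
    (736, 0x19116A0),
    (737, 0x19115F8),
]


def arrival_order(commits: List[Tuple[int, int]]) -> List[int]:
    """Return XIDs in the order the subscriber would receive them,
    assuming transactions are sent in commit-LSN order."""
    return [xid for xid, _lsn in sorted(commits, key=lambda c: c[1])]


def naive_is_stale(skip_xid: int, first_received_xid: int) -> bool:
    """The heuristic under discussion: treat skip_xid as too old when the
    first received transaction has a newer XID (wraparound ignored)."""
    return first_received_xid > skip_xid


if __name__ == "__main__":
    order = arrival_order(COMMITS)
    print(order)
    # The user asked to skip 736, but 737 arrives first.
    print(naive_is_stale(skip_xid=736, first_received_xid=order[0]))
```

Under these assumptions the heuristic would invalidate xid 736 on
seeing 737 first, even though 736 is still to come.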

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

May 26, 2021, 12:11:46

On Tue, May 25, 2021 at 6:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, May 25, 2021 at 7:21 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > If there's no way to get the "correct LSN", then why can't we just
> > print that LSN in the error context and/or in the new statistics view
> > for logical replication workers, so that any of the existing ways can
> > be used to skip exactly one txn?
>
> I think specifying XID to the subscription is more understandable for users.
>

I agree with you that specifying XID could be easier and
understandable for users. I was thinking and studying a bit about what
other systems do in this regard. Why don't we try to provide conflict
resolution methods for users? The idea could be that either the
conflicts can be resolved automatically or manually. In the case of
manual resolution, users can use the existing methods or the XID stuff
you are proposing here and in case of automatic resolution, the
in-built or corresponding user-defined functions will be invoked for
conflict resolution. There are more details to figure out in the
automatic resolution scheme but I see a lot of value in doing the
same.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

May 27, 2021, 07:25:54

On Wed, May 26, 2021 at 3:43 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, May 25, 2021 at 12:26 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, May 24, 2021 at 7:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, May 24, 2021 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > I think you need to consider few more things here:
> > > (a) Say the error occurs after applying some part of changes, then
> > > just skipping the remaining part won't be sufficient, we probably need
> > > to someway rollback the applied changes (by rolling back the
> > > transaction or in some other way).
> >
> > After more thought, it might be better to that setting and resetting
> > the XID to skip requires disabling the subscription.
> >
>
> It might be better if it doesn't require disabling the subscription
> because it would be more steps for the user to disable/enable it. It
> is not clear to me what exactly you want to gain by disabling the
> subscription in this case.

The situation I'm considering is where the user specifies the XID while
the worker is applying the changes of the transaction with that XID.
In this case, I think we need to somehow roll back the changes applied
so far. Perhaps we can either roll back the transaction and ignore the
remaining changes, or restart and ignore the entire transaction from
the beginning. Also, we need to handle the case where the user resets
the XID after the worker has skipped writing some stream files. I
thought those parts could be complicated, but on further thought they
might not be.

>
> > This would not be
> > a restriction for users since logical replication is likely to already
> > stop (and possibly repeating restarting and stopping) due to an error.
> > Setting and resetting the XID modifies the system catalog so it's a
> > crash-safe change and survives beyond the server restarts. When a
> > logical replication worker starts, it checks the XID. If the worker
> > receives changes associated with the transaction with the specified
> > XID, it can ignore the entire transaction.
> >
> > > (b) How do you handle streamed transactions? It is possible that some
> > > of the streams are successful and the error occurs after that, say
> > > when writing to the stream file. Now, would you skip writing to stream
> > > file or will you write it, and then during apply, you will skip the
> > > entire transaction and remove the corresponding stream file.
> >
> > I think streamed transactions can be handled in the same way described in (a).

If setting and resetting the XID can be performed while the worker is
running, we would need to write stream files even if we're receiving
changes that are associated with the specified XID. Since it could
happen that the user resets the XID after we processed some of the
streamed changes, we would need to decide whether or not to skip the
transaction when starting to apply changes.
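
As a rough illustration of that "always write, decide at apply time"
idea, here is a Python sketch. StreamBuffer is a hypothetical in-memory
stand-in for the worker's stream files, not the actual implementation;
the skip XID is passed in only when the transaction is finished.

```python
from typing import Dict, List, Optional


class StreamBuffer:
    """Toy model of per-transaction stream files kept by the apply worker."""

    def __init__(self) -> None:
        # xid -> buffered changes (stands in for a stream file on disk)
        self.files: Dict[int, List[str]] = {}

    def write(self, xid: int, change: str) -> None:
        """Always buffer the change, whether or not xid is marked to skip."""
        self.files.setdefault(xid, []).append(change)

    def finish(self, xid: int, skip_xid: Optional[int]) -> List[str]:
        """At commit time, consult the current skip setting.

        Returns the changes to apply; an empty list means the whole
        transaction was skipped. The buffered data is removed either way.
        """
        changes = self.files.pop(xid, [])
        if skip_xid is not None and xid == skip_xid:
            return []
        return changes


if __name__ == "__main__":
    buf = StreamBuffer()
    buf.write(590, "INSERT 1")
    buf.write(590, "INSERT 2")
    # The user reset the skip XID before the commit arrived, so the
    # transaction is applied in full.
    print(buf.finish(590, skip_xid=None))
```

Because nothing is dropped while streaming, a late RESET does not lose
changes; the cost is writing stream files for a transaction that may
end up being discarded.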

> >
> > > (c) There is also a possibility that the error occurs while applying
> > > the changes of some subtransaction (this is only possible for
> > > streaming xacts), so, in such cases, do we allow users to rollback the
> > > subtransaction or user has to rollback the entire transaction. I am
> > > not sure but maybe for very large transactions users might just want
> > > to rollback the subtransaction.
> >
> > If the user specifies XID of a subtransaction, it would be better to
> > skip only the subtransaction. If specifies top transaction XID, it
> > would be better to skip the entire transaction. What do you think?
> >
>
> makes sense.
>
> > > (d) How about prepared transactions? Do we need to rollback the
> > > prepared transaction if user decides to skip such a transaction? We
> > > already allow prepared transactions to be streamed to plugins and the
> > > work for subscriber-side apply is in progress [1], so I think we need
> > > to consider this case as well.
> >
> > If a transaction replicated from the subscriber could be prepared on
> > the subscriber, it would be guaranteed to be able to be either
> > committed or rolled back. Given that this feature is to skip a problem
> > transaction, I think it should not do anything for transactions that
> > are already prepared on the subscriber.
> >
>
> makes sense, but I think we need to reset the XID in such a case.

Agreed.

>
> > > (e) Do we want to provide such a feature via output plugins as well,
> > > if not, why?
> >
> > You mean to specify an XID to skip on the publisher side? Since I've
> > been considering this feature as a way to resume the logical
> > replication having a problem I've not thought of that idea but It
> > would be a good idea. Do you have any use cases?
> >
>
> No. On again thinking about this, I think we can leave this for now.
>
> > If we specified the
> > XID on the publisher, multiple subscribers would skip that
> > transaction.
> >
> > >
> > > > For (2), what I'm thinking is to add a new action to ALTER
> > > > SUBSCRIPTION command like ALTER SUBSCRIPTION test_sub SET SKIP
> > > > TRANSACTION 590. Also, we can have actions to reset it; ALTER
> > > > SUBSCRIPTION test_sub RESET SKIP TRANSACTION. Those commands add the
> > > > XID to a new column of pg_subscription or a new catalog, having the
> > > > worker reread its subscription information. Once the worker skipped
> > > > the specified transaction, it resets the transaction to skip on the
> > > > catalog.
> > > >
> > >
> > > What if we fail while updating the reset information in the catalog?
> > > Will it be the responsibility of the user to reset such a transaction
> > > or we will retry it after restart of worker?  Now, say, we give such a
> > > responsibility to the user and the user forgets to reset it then there
> > > is a possibility that after wraparound we will again skip the
> > > transaction which is not intended. And, if we want to retry it after
> > > restart of worker, how will the worker remember the previous failure?
> >
> > As described above, setting and resetting XID to skip is implemented
> > as a normal system catalog change, so it's crash-safe and persisted. I
> > think that the worker can either removes the XID or mark it as done
> > once it skipped the specified transaction so that it won't skip the
> > same XID again after wraparound.
> >
>
> It all depends on when exactly you want to update the catalog
> information. Say after skipping commit of the XID, we do update the
> corresponding LSN to be communicated as already processed to the
> subscriber and then get the error while updating the catalog
> information then next time we might not know whether to update the
> catalog for skipped XID.
>
> > Also, it might be better if we reset
> > the XID also when a subscription field such as subconninfo is changed
> > because it could imply the worker will connect to another publisher
> > having a different XID space.
> >
> > We also need to handle the cases where the user specifies an old XID
> > or XID whose transaction is already prepared on the subscriber. I
> > think the worker can reset the XID with a warning when it finds out
> > that the XID seems no longer valid or it cannot skip the specified
> > XID. For example in the former case, it can do that when the first
> > received transaction’s XID is newer than the specified XID.
> >
>
> But how can we guarantee that older XID can't be received later? Is
> there a guarantee that we receive the transactions on subscriber in
> XID order.

Considering the above two comments, it might be better to provide a
way to skip the transaction that is already known to be conflicted
rather than allowing users to specify the arbitrary XID.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

May 27, 2021, 08:48:02

On Thu, May 27, 2021 at 9:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, May 26, 2021 at 3:43 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, May 25, 2021 at 12:26 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Mon, May 24, 2021 at 7:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Mon, May 24, 2021 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > I think you need to consider few more things here:
> > > > (a) Say the error occurs after applying some part of changes, then
> > > > just skipping the remaining part won't be sufficient, we probably need
> > > > to someway rollback the applied changes (by rolling back the
> > > > transaction or in some other way).
> > >
> > > After more thought, it might be better to that setting and resetting
> > > the XID to skip requires disabling the subscription.
> > >
> >
> > It might be better if it doesn't require disabling the subscription
> > because it would be more steps for the user to disable/enable it. It
> > is not clear to me what exactly you want to gain by disabling the
> > subscription in this case.
>
> The situation I’m considered is where the user specifies the XID while
> the worker is applying the changes of the transaction with that XID.
> In this case, I think we need to somehow rollback the changes applied
> so far. Perhaps we can either rollback the transaction and ignore the
> remaining changes or restart and ignore the entire transaction from
> the beginning.
>

If we follow your suggestion of only allowing XIDs that have been
known to have conflicts then probably we don't need to worry about
rollbacks.

> > > >
> > > > > For (2), what I'm thinking is to add a new action to ALTER
> > > > > SUBSCRIPTION command like ALTER SUBSCRIPTION test_sub SET SKIP
> > > > > TRANSACTION 590. Also, we can have actions to reset it; ALTER
> > > > > SUBSCRIPTION test_sub RESET SKIP TRANSACTION. Those commands add the
> > > > > XID to a new column of pg_subscription or a new catalog, having the
> > > > > worker reread its subscription information. Once the worker skipped
> > > > > the specified transaction, it resets the transaction to skip on the
> > > > > catalog.
> > > > >
> > > >
> > > > What if we fail while updating the reset information in the catalog?
> > > > Will it be the responsibility of the user to reset such a transaction
> > > > or we will retry it after restart of worker?  Now, say, we give such a
> > > > responsibility to the user and the user forgets to reset it then there
> > > > is a possibility that after wraparound we will again skip the
> > > > transaction which is not intended. And, if we want to retry it after
> > > > restart of worker, how will the worker remember the previous failure?
> > >
> > > As described above, setting and resetting XID to skip is implemented
> > > as a normal system catalog change, so it's crash-safe and persisted. I
> > > think that the worker can either removes the XID or mark it as done
> > > once it skipped the specified transaction so that it won't skip the
> > > same XID again after wraparound.
> > >
> >
> > It all depends on when exactly you want to update the catalog
> > information. Say after skipping commit of the XID, we do update the
> > corresponding LSN to be communicated as already processed to the
> > subscriber and then get the error while updating the catalog
> > information then next time we might not know whether to update the
> > catalog for skipped XID.
> >
> > > Also, it might be better if we reset
> > > the XID also when a subscription field such as subconninfo is changed
> > > because it could imply the worker will connect to another publisher
> > > having a different XID space.
> > >
> > > We also need to handle the cases where the user specifies an old XID
> > > or XID whose transaction is already prepared on the subscriber. I
> > > think the worker can reset the XID with a warning when it finds out
> > > that the XID seems no longer valid or it cannot skip the specified
> > > XID. For example in the former case, it can do that when the first
> > > received transaction’s XID is newer than the specified XID.
> > >
> >
> > But how can we guarantee that older XID can't be received later? Is
> > there a guarantee that we receive the transactions on subscriber in
> > XID order.
>
> Considering the above two comments, it might be better to provide a
> way to skip the transaction that is already known to be conflicted
> rather than allowing users to specify the arbitrary XID.
>

Okay, that makes sense, but I am still not sure how you will identify
whether we need to reset the XID if doing that failed in the previous
attempt. Also, I am thinking that instead of a stats view, we may need
to consider having a system table (pg_replication_conflicts or
something like that) for this. What if the stats information is lost
(say, either due to a crash or due to UDP packet loss)? Can we rely
on a stats view for this?

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

May 27, 2021, 09:30:48

On Wed, May 26, 2021 at 6:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, May 25, 2021 at 6:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, May 25, 2021 at 7:21 PM Bharath Rupireddy
> > <bharath.rupireddyforpostgres@gmail.com> wrote:
> > >
> > > If there's no way to get the "correct LSN", then why can't we just
> > > print that LSN in the error context and/or in the new statistics view
> > > for logical replication workers, so that any of the existing ways can
> > > be used to skip exactly one txn?
> >
> > I think specifying XID to the subscription is more understandable for users.
> >
>
> I agree with you that specifying XID could be easier and
> understandable for users. I was thinking and studying a bit about what
> other systems do in this regard. Why don't we try to provide conflict
> resolution methods for users? The idea could be that either the
> conflicts can be resolved automatically or manually. In the case of
> manual resolution, users can use the existing methods or the XID stuff
> you are proposing here and in case of automatic resolution, the
> in-built or corresponding user-defined functions will be invoked for
> conflict resolution. There are more details to figure out in the
> automatic resolution scheme but I see a lot of value in doing the
> same.

Yeah, I also see a lot of value in automatic conflict resolution. But
maybe we can have both ways? For example, in cases where the user wants
to resolve conflicts in different ways, or where a conflict cannot be
resolved by automatic resolution (I'm not sure there are such cases in
practice, though), manual resolution would also have value.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

May 27, 2021, 11:15:41

On Thu, May 27, 2021 at 2:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, May 27, 2021 at 9:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, May 26, 2021 at 3:43 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, May 25, 2021 at 12:26 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Mon, May 24, 2021 at 7:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Mon, May 24, 2021 at 1:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > I think you need to consider few more things here:
> > > > > (a) Say the error occurs after applying some part of changes, then
> > > > > just skipping the remaining part won't be sufficient, we probably need
> > > > > to someway rollback the applied changes (by rolling back the
> > > > > transaction or in some other way).
> > > >
> > > > After more thought, it might be better to that setting and resetting
> > > > the XID to skip requires disabling the subscription.
> > > >
> > >
> > > It might be better if it doesn't require disabling the subscription
> > > because it would be more steps for the user to disable/enable it. It
> > > is not clear to me what exactly you want to gain by disabling the
> > > subscription in this case.
> >
> > The situation I’m considered is where the user specifies the XID while
> > the worker is applying the changes of the transaction with that XID.
> > In this case, I think we need to somehow rollback the changes applied
> > so far. Perhaps we can either rollback the transaction and ignore the
> > remaining changes or restart and ignore the entire transaction from
> > the beginning.
> >
>
> If we follow your suggestion of only allowing XIDs that have been
> known to have conflicts then probably we don't need to worry about
> rollbacks.
>
> > > > >
> > > > > > For (2), what I'm thinking is to add a new action to ALTER
> > > > > > SUBSCRIPTION command like ALTER SUBSCRIPTION test_sub SET SKIP
> > > > > > TRANSACTION 590. Also, we can have actions to reset it; ALTER
> > > > > > SUBSCRIPTION test_sub RESET SKIP TRANSACTION. Those commands add the
> > > > > > XID to a new column of pg_subscription or a new catalog, having the
> > > > > > worker reread its subscription information. Once the worker skipped
> > > > > > the specified transaction, it resets the transaction to skip on the
> > > > > > catalog.
> > > > > >
> > > > >
> > > > > What if we fail while updating the reset information in the catalog?
> > > > > Will it be the responsibility of the user to reset such a transaction
> > > > > or we will retry it after restart of worker?  Now, say, we give such a
> > > > > responsibility to the user and the user forgets to reset it then there
> > > > > is a possibility that after wraparound we will again skip the
> > > > > transaction which is not intended. And, if we want to retry it after
> > > > > restart of worker, how will the worker remember the previous failure?
> > > >
> > > > As described above, setting and resetting XID to skip is implemented
> > > > as a normal system catalog change, so it's crash-safe and persisted. I
> > > > think that the worker can either removes the XID or mark it as done
> > > > once it skipped the specified transaction so that it won't skip the
> > > > same XID again after wraparound.
> > > >
> > >
> > > It all depends on when exactly you want to update the catalog
> > > information. Say after skipping commit of the XID, we do update the
> > > corresponding LSN to be communicated as already processed to the
> > > subscriber and then get the error while updating the catalog
> > > information then next time we might not know whether to update the
> > > catalog for skipped XID.
> > >
> > > > Also, it might be better if we reset
> > > > the XID also when a subscription field such as subconninfo is changed
> > > > because it could imply the worker will connect to another publisher
> > > > having a different XID space.
> > > >
> > > > We also need to handle the cases where the user specifies an old XID
> > > > or XID whose transaction is already prepared on the subscriber. I
> > > > think the worker can reset the XID with a warning when it finds out
> > > > that the XID seems no longer valid or it cannot skip the specified
> > > > XID. For example in the former case, it can do that when the first
> > > > received transaction’s XID is newer than the specified XID.
> > > >
> > >
> > > But how can we guarantee that older XID can't be received later? Is
> > > there a guarantee that we receive the transactions on subscriber in
> > > XID order.
> >
> > Considering the above two comments, it might be better to provide a
> > way to skip the transaction that is already known to be conflicted
> > rather than allowing users to specify the arbitrary XID.
> >
>
> Okay, that makes sense but still not sure how will you identify if we
> need to reset XID in case of failure doing that in the previous
> attempt.

It's just an idea, but could we record the failed transaction's XID as
well as its commit LSN? The sequence I'm thinking of is:

1. The worker records the XID and commit LSN of the failed transaction
in a catalog.
2. The user specifies how to resolve that conflicting transaction
(currently only 'skip' is supported) and writes it to the catalog.
3. The worker applies the resolution method according to the catalog.
If the worker hasn't started to apply those changes, it can skip the
entire transaction. If it has, it rolls back the transaction and
ignores the remaining changes.

The worker needs neither to reset the information about the last failed
transaction nor to mark the conflicting transaction as resolved. When
checking the catalog, the worker will ignore that information if the
commit LSN has already been passed.
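
The sequence above could be modeled roughly as follows. This is only a
Python sketch: ConflictRecord and should_skip are hypothetical names,
LSNs are plain integers, and the rule "a record is stale once the
applied position has reached its commit LSN" is my reading of the
description, not a settled design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConflictRecord:
    """Hypothetical catalog row written by the worker on failure."""
    xid: int
    commit_lsn: int
    # None until the user picks a method; only "skip" is discussed here.
    resolution: Optional[str] = None


def should_skip(record: Optional[ConflictRecord], xid: int,
                commit_lsn: int, applied_up_to: int) -> bool:
    """Decide whether the worker skips the incoming transaction.

    applied_up_to is the LSN up to which the worker has already applied
    (or skipped) transactions.
    """
    if record is None or record.resolution != "skip":
        return False
    if applied_up_to >= record.commit_lsn:
        # The commit LSN has been passed: the record is stale and is
        # ignored, so no explicit reset is needed.
        return False
    return xid == record.xid and commit_lsn == record.commit_lsn


if __name__ == "__main__":
    rec = ConflictRecord(xid=590, commit_lsn=1000, resolution="skip")
    print(should_skip(rec, xid=590, commit_lsn=1000, applied_up_to=900))
    # Same XID value reused after wraparound, at a much later LSN.
    print(should_skip(rec, xid=590, commit_lsn=5000, applied_up_to=4000))
```

Keying the decision on the commit LSN as well as the XID is what keeps
a leftover record from matching a reused XID after wraparound.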

> Also, I am thinking that instead of a stat view, do we need
> to consider having a system table (pg_replication_conflicts or
> something like that) for this because what if stats information is
> lost (say either due to crash or due to udp packet loss), can we rely
> on stats view for this?

Yeah, it seems better to use a catalog.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

May 27, 2021, 13:04:37

On Thu, May 27, 2021 at 1:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, May 27, 2021 at 2:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > Okay, that makes sense but still not sure how will you identify if we
> > need to reset XID in case of failure doing that in the previous
> > attempt.
>
> It's a just idea but we can record the failed transaction with XID as
> well as its commit LSN passed? The sequence I'm thinking is,
>
> 1. the worker records the XID and commit LSN of the failed transaction
> to a catalog.
>

When will you record this info? I am not sure if we can try to update
this when an error has occurred. We can think of using try..catch in
the apply worker and then recording it in the catch block on error,
but would that be advisable? One random thought that occurred to me is
that the apply worker notifies the launcher (or maybe another process)
of such information, and that process logs it.

> 2. the user specifies how to resolve that conflict transaction
> (currently only 'skip' is supported) and writes to the catalog.
> 3. the worker does the resolution method according to the catalog. If
> the worker didn't start to apply those changes, it can skip the entire
> transaction. If did, it rollbacks the transaction and ignores the
> remaining.
>
> The worker needs neither to reset information of the last failed
> transaction nor to mark the conflicted transaction as resolved. The
> worker will ignore that information when checking the catalog if the
> commit LSN is passed.
>

So won't this require us to check the required info in the catalog
before applying each transaction? If so, that might add overhead; maybe
we can build a cache of the highest commit LSN that can be consulted
instead of the catalog table. I think we also need to think about when
to remove rows for which the conflict has been resolved, as we can't
let that information grow infinitely.
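
A minimal Python sketch of such a cache, under the assumption that the
worker knows a transaction's commit LSN when it starts applying it.
ConflictCache and its methods are hypothetical; the dict stands in for
the catalog table, and the lookups counter exists only to show when the
slow path is taken.

```python
from typing import Dict, Iterable, Tuple


class ConflictCache:
    """Caches the highest commit LSN among recorded conflicts."""

    def __init__(self, catalog_rows: Iterable[Tuple[int, int]]) -> None:
        # xid -> commit LSN; stands in for the conflicts catalog.
        self.rows: Dict[int, int] = dict(catalog_rows)
        self.lookups = 0
        self.max_commit_lsn = max(self.rows.values(), default=0)

    def is_conflicted(self, xid: int, commit_lsn: int) -> bool:
        """Return True if (xid, commit_lsn) matches a recorded conflict."""
        if commit_lsn > self.max_commit_lsn:
            # Fast path: beyond every recorded conflict, no lookup needed.
            return False
        self.lookups += 1
        return self.rows.get(xid) == commit_lsn

    def prune(self, applied_up_to: int) -> None:
        """Drop rows whose commit LSN has been passed, so the table
        does not grow without bound."""
        self.rows = {x: lsn for x, lsn in self.rows.items()
                     if lsn > applied_up_to}
        self.max_commit_lsn = max(self.rows.values(), default=0)


if __name__ == "__main__":
    cache = ConflictCache([(590, 1000)])
    print(cache.is_conflicted(600, 2000), cache.lookups)
    print(cache.is_conflicted(590, 1000), cache.lookups)
```

Pruning by applied position is one possible answer to the cleanup
question; when and by which process that would run is left open here.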

> > Also, I am thinking that instead of a stat view, do we need
> > to consider having a system table (pg_replication_conflicts or
> > something like that) for this because what if stats information is
> > lost (say either due to crash or due to udp packet loss), can we rely
> > on stats view for this?
>
> Yeah, it seems better to use a catalog.
>

Okay.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

May 27, 2021, 13:26:33

On Thu, May 27, 2021 at 12:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, May 26, 2021 at 6:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > I agree with you that specifying XID could be easier and
> > understandable for users. I was thinking and studying a bit about what
> > other systems do in this regard. Why don't we try to provide conflict
> > resolution methods for users? The idea could be that either the
> > conflicts can be resolved automatically or manually. In the case of
> > manual resolution, users can use the existing methods or the XID stuff
> > you are proposing here and in case of automatic resolution, the
> > in-built or corresponding user-defined functions will be invoked for
> > conflict resolution. There are more details to figure out in the
> > automatic resolution scheme but I see a lot of value in doing the
> > same.
>
> Yeah, I also see a lot of value in automatic conflict resolution. But
> maybe we can have both ways? For example, in case where the user wants
> to resolve conflicts in different ways or a conflict that cannot be
> resolved by automatic resolution (not sure there is in practice
> though), the manual resolution would also have value.
>

Right, that is exactly what I was saying. So, even if both can be done
as separate patches, we should try to design the manual resolution in
a way that can be extended for an automatic resolution system. I think
we can try to have some initial idea/design/POC for an automatic
resolution as well to ensure that the manual resolution scheme can be
further extended.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

May 29, 2021, 05:32:04

On Thu, May 27, 2021 at 7:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, May 27, 2021 at 12:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, May 26, 2021 at 6:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > I agree with you that specifying XID could be easier and
> > > understandable for users. I was thinking and studying a bit about what
> > > other systems do in this regard. Why don't we try to provide conflict
> > > resolution methods for users? The idea could be that either the
> > > conflicts can be resolved automatically or manually. In the case of
> > > manual resolution, users can use the existing methods or the XID stuff
> > > you are proposing here and in case of automatic resolution, the
> > > in-built or corresponding user-defined functions will be invoked for
> > > conflict resolution. There are more details to figure out in the
> > > automatic resolution scheme but I see a lot of value in doing the
> > > same.
> >
> > Yeah, I also see a lot of value in automatic conflict resolution. But
> > maybe we can have both ways? For example, in case where the user wants
> > to resolve conflicts in different ways or a conflict that cannot be
> > resolved by automatic resolution (not sure there is in practice
> > though), the manual resolution would also have value.
> >
>
> Right, that is exactly what I was saying. So, even if both can be done
> as separate patches, we should try to design the manual resolution in
> a way that can be extended for an automatic resolution system. I think
> we can try to have some initial idea/design/POC for an automatic
> resolution as well to ensure that the manual resolution scheme can be
> further extended.

Totally agreed.

But perhaps we might want to note that the conflict resolution we're
talking about is to resolve conflicts at the row or column level. It
doesn't necessarily raise an ERROR and the granularity of resolution
is per record or column. For example, if a DELETE and an UPDATE
process the same tuple (searched by PK), the UPDATE may not find the
tuple and be ignored due to the tuple having been already deleted. In
this case, no ERROR will occur (i.e., the UPDATE will be ignored), but the
user may want to do another conflict resolution. On the other hand,
the feature proposed here assumes that an error has already occurred
and logical replication has already been stopped, and it resolves the
error by skipping the entire transaction.

IIUC the conflict resolution can be thought of as a combination of
types of conflicts and the resolution that can be applied to them. For
example, if there is a conflict between INSERT and INSERT and the
latter INSERT violates the unique constraint, an ERROR is raised, so
the DBA can resolve it manually. But there is another way to
automatically resolve it by selecting the tuple having a newer
timestamp. On the other hand, in the DELETE and UPDATE conflict
described above, it's possible to automatically ignore the fact that
the UPDATE could not update the tuple. Or we can even generate an ERROR
so that the DBA can resolve it manually. The DBA can manually resolve
the conflict in various ways: skipping the entire transaction from the
origin, choosing the tuple having a newer/older timestamp, etc.

In that sense, we can think of the feature proposed here as a feature
that provides a way to resolve the conflict that would originally
cause an ERROR by skipping the entire transaction. If we add a
solution that raises an ERROR for conflicts that don't originally
raise an ERROR (like DELETE and UPDATE conflict) in the future, we
will be able to manually skip each transaction for all types of
conflicts.
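
As an illustration only (not part of any patch), the "conflict plus
resolution" combination described above could be sketched in Python as
follows. All names here (Resolution, Row, resolve) are hypothetical, and
the sketch assumes each row version carries a commit timestamp.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Resolution(Enum):
    ERROR = auto()                 # stop replication; the DBA resolves it manually
    IGNORE = auto()                # drop the incoming change, keep local state
    NEWER_TIMESTAMP_WINS = auto()  # keep whichever row version is newer


@dataclass
class Row:
    key: int
    value: str
    ts: float  # commit timestamp of the change that produced this version


class ReplicationConflict(Exception):
    """Raised when the chosen resolution is to stop with an ERROR."""


def resolve(resolution: Resolution, local: Optional[Row], remote: Row) -> Optional[Row]:
    """Return the row version that should remain on the subscriber.

    `local` is None when the target tuple no longer exists, as in the
    DELETE and UPDATE case.
    """
    if resolution is Resolution.ERROR:
        raise ReplicationConflict(f"conflict on key {remote.key}")
    if resolution is Resolution.IGNORE:
        return local
    if resolution is Resolution.NEWER_TIMESTAMP_WINS:
        if local is None or remote.ts > local.ts:
            return remote
        return local
    raise ValueError(f"unknown resolution: {resolution}")
```

With this shape, skipping the entire transaction is what the DBA does
after the ERROR branch has fired, while the other branches never stop
replication at all.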

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

May 29, 2021, 05:56:39

On Thu, May 27, 2021 at 7:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, May 27, 2021 at 1:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, May 27, 2021 at 2:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > Okay, that makes sense but still not sure how will you identify if we
> > > need to reset XID in case of failure doing that in the previous
> > > attempt.
> >
> > It's a just idea but we can record the failed transaction with XID as
> > well as its commit LSN passed? The sequence I'm thinking is,
> >
> > 1. the worker records the XID and commit LSN of the failed transaction
> > to a catalog.
> >
>
> When will you record this info? I am not sure if we can try to update
> this when an error has occurred. We can think of using try..catch in
> apply worker and then record it in catch on error but would that be
> advisable? One random thought that occurred to me is to that apply
> worker notifies such information to the launcher (or maybe another
> process) which will log this information.

Yeah, I was concerned about that too and had the same idea. The
information could still fail to be written if the server crashes before
the launcher writes it. But I think that's acceptable.

>
> > 2. the user specifies how to resolve that conflict transaction
> > (currently only 'skip' is supported) and writes to the catalog.
> > 3. the worker does the resolution method according to the catalog. If
> > the worker didn't start to apply those changes, it can skip the entire
> > transaction. If did, it rollbacks the transaction and ignores the
> > remaining.
> >
> > The worker needs neither to reset information of the last failed
> > transaction nor to mark the conflicted transaction as resolved. The
> > worker will ignore that information when checking the catalog if the
> > commit LSN is passed.
> >
>
> So won't this require us to check the required info in the catalog
> before applying each transaction? If so, that might be overhead, maybe
> we can build some cache of the highest commitLSN that can be consulted
> rather than the catalog table.

I think workers can cache that information when they start, and
invalidate and reload the cache when the catalog gets updated. Specifying
an XID to skip will update the catalog, invalidating the cache.
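
As a rough illustration of that caching idea (a sketch only, not how the
apply worker is implemented), the following Python models the catalog
with a version counter that stands in for cache invalidation. SkipCatalog
and WorkerCache are hypothetical names.

```python
class SkipCatalog:
    """Stand-in for the catalog that stores the XID to skip per subscription."""

    def __init__(self):
        self._skip_xid = {}  # subscription name -> XID to skip
        self.version = 0     # bumped on every update (models invalidation)
        self.reads = 0       # counts catalog lookups, for demonstration

    def set_skip_xid(self, sub, xid):
        self._skip_xid[sub] = xid
        self.version += 1

    def read_skip_xid(self, sub):
        self.reads += 1
        return self._skip_xid.get(sub)


class WorkerCache:
    """Loads the skip XID once at start and reloads only after an update."""

    def __init__(self, catalog, sub):
        self.catalog = catalog
        self.sub = sub
        self._load()

    def _load(self):
        self._seen_version = self.catalog.version
        self._skip_xid = self.catalog.read_skip_xid(self.sub)

    def should_skip(self, xid):
        if self.catalog.version != self._seen_version:
            self._load()  # cache was invalidated by a catalog update
        return self._skip_xid is not None and xid == self._skip_xid
```

The point of the sketch is that the per-transaction check touches only
the cached value; the catalog is read again only after someone updates
it.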

> I think we need to think about when to
> remove rows for which conflict has been resolved as we can't let that
> information grow infinitely.

I guess we can update catalog tuples in place when another conflict
happens next time. The catalog tuple should be fixed size. The
already-resolved conflict will have the commit LSN older than its
replication origin's LSN.
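
For illustration, that comparison could look like the Python sketch
below. It assumes LSNs in the usual textual "hi/lo" hexadecimal form and
treats an entry as resolved when its commit LSN is strictly older than
the origin's LSN; parse_lsn and is_resolved are hypothetical helpers.

```python
def parse_lsn(text):
    """Convert an LSN such as '16/B374D848' into a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def is_resolved(conflict_commit_lsn, origin_lsn):
    """True if the recorded conflict lies behind the replication origin."""
    return parse_lsn(conflict_commit_lsn) < parse_lsn(origin_lsn)
```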


Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

May 29, 2021, 09:54:16

On Sat, May 29, 2021 at 8:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, May 27, 2021 at 7:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, May 27, 2021 at 1:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > 1. the worker records the XID and commit LSN of the failed transaction
> > > to a catalog.
> > >
> >
> > When will you record this info? I am not sure if we can try to update
> > this when an error has occurred. We can think of using try..catch in
> > apply worker and then record it in catch on error but would that be
> > advisable? One random thought that occurred to me is to that apply
> > worker notifies such information to the launcher (or maybe another
> > process) which will log this information.
>
> Yeah, I was concerned about that too and had the same idea. The
> information still could not be written if the server crashes before
> the launcher writes it. But I think it's an acceptable.
>

True, because even if the launcher restarts, the apply worker will
error out again and resend the information. I guess we can have an
error queue where apply workers can add their information and the
launcher will then process those. If we do that, then we probably need
to define what we want to do if the queue gets full: either the apply
worker nudges the launcher and waits, or it can just throw an error and
continue. If you have any better ideas for sharing this information then
we can consider those as well.
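
The two queue-full policies mentioned above can be sketched with
Python's standard queue module. This is only an illustration of the
trade-off, not a proposal for the shared-memory layout; report_error is
a hypothetical helper.

```python
import queue


def report_error(q, entry, wait_timeout=None):
    """Try to hand an error entry to the launcher.

    With wait_timeout=None the worker does not wait: a full queue simply
    means the entry is not recorded this time. With a timeout the worker
    waits that long for the launcher to drain the queue.
    Returns True if the entry was queued, False otherwise.
    """
    try:
        if wait_timeout is None:
            q.put_nowait(entry)
        else:
            q.put(entry, timeout=wait_timeout)
        return True
    except queue.Full:
        return False
```

Because the apply worker will hit the same error again after a restart,
dropping an entry when the queue is full only delays the report.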

> >
> > > 2. the user specifies how to resolve that conflict transaction
> > > (currently only 'skip' is supported) and writes to the catalog.
> > > 3. the worker does the resolution method according to the catalog. If
> > > the worker didn't start to apply those changes, it can skip the entire
> > > transaction. If did, it rollbacks the transaction and ignores the
> > > remaining.
> > >
> > > The worker needs neither to reset information of the last failed
> > > transaction nor to mark the conflicted transaction as resolved. The
> > > worker will ignore that information when checking the catalog if the
> > > commit LSN is passed.
> > >
> >
> > So won't this require us to check the required info in the catalog
> > before applying each transaction? If so, that might be overhead, maybe
> > we can build some cache of the highest commitLSN that can be consulted
> > rather than the catalog table.
>
> I think workers can cache that information when starts and invalidates
> and reload the cache when the catalog gets updated.  Specifying to
> skip XID will update the catalog, invalidating the cache.
>
> > I think we need to think about when to
> > remove rows for which conflict has been resolved as we can't let that
> > information grow infinitely.
>
> I guess we can update catalog tuples in place when another conflict
> happens next time. The catalog tuple should be fixed size. The
> already-resolved conflict will have the commit LSN older than its
> replication origin's LSN.
>

Okay, but I have a slight concern that we will keep an XID in the system
that might no longer be valid. So, we will keep this info about
subscribers around till one performs drop subscription; hopefully,
that doesn't lead to too many rows. This will be okay as
per the current design but say tomorrow we decide to parallelize the
apply for a subscription then there could be multiple errors
corresponding to a subscription and in that case, such a design might
appear quite limiting. One possibility could be that when the launcher
is periodically checking for new error messages, it can clean up the
conflicts catalog as well, or maybe autovacuum does this periodically
as it does for stats (via pgstat_vacuum_stat).
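
A minimal sketch of such a periodic cleanup pass, in Python and purely
for illustration: ConflictEntry and cleanup_conflicts are hypothetical
names, and LSNs are modeled as plain integers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConflictEntry:
    subid: int       # subscription the error belongs to
    xid: int         # remote XID of the failed transaction
    commit_lsn: int  # commit LSN of that transaction


def cleanup_conflicts(entries, live_subids, origin_lsn_by_sub):
    """Return only the entries that are still worth keeping."""
    kept = []
    for e in entries:
        if e.subid not in live_subids:
            continue  # the subscription was dropped
        origin_lsn = origin_lsn_by_sub.get(e.subid)
        if origin_lsn is not None and e.commit_lsn < origin_lsn:
            continue  # the origin has moved past this transaction
        kept.append(e)
    return kept
```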

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

May 31, 2021, 10:09:19

On Sat, May 29, 2021 at 3:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, May 29, 2021 at 8:27 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, May 27, 2021 at 7:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, May 27, 2021 at 1:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > 1. the worker records the XID and commit LSN of the failed transaction
> > > > to a catalog.
> > > >
> > >
> > > When will you record this info? I am not sure if we can try to update
> > > this when an error has occurred. We can think of using try..catch in
> > > apply worker and then record it in catch on error but would that be
> > > advisable? One random thought that occurred to me is to that apply
> > > worker notifies such information to the launcher (or maybe another
> > > process) which will log this information.
> >
> > Yeah, I was concerned about that too and had the same idea. The
> > information still could not be written if the server crashes before
> > the launcher writes it. But I think it's an acceptable.
> >
>
> True, because even if the launcher restarts, the apply worker will
> error out again and resend the information. I guess we can have an
> error queue where apply workers can add their information and the
> launcher will then process those.  If we do that, then we need to
> probably define what we want to do if the queue gets full, either
> apply worker nudge launcher and wait or it can just throw an error and
> continue. If you have any better ideas to share this information then
> we can consider those as well.

+1 for using an error queue. Maybe we need to avoid queuing the same
error more than once, to keep the catalog from being updated
frequently?

>
> > >
> > > > 2. the user specifies how to resolve that conflict transaction
> > > > (currently only 'skip' is supported) and writes to the catalog.
> > > > 3. the worker does the resolution method according to the catalog. If
> > > > the worker didn't start to apply those changes, it can skip the entire
> > > > transaction. If did, it rollbacks the transaction and ignores the
> > > > remaining.
> > > >
> > > > The worker needs neither to reset information of the last failed
> > > > transaction nor to mark the conflicted transaction as resolved. The
> > > > worker will ignore that information when checking the catalog if the
> > > > commit LSN is passed.
> > > >
> > >
> > > So won't this require us to check the required info in the catalog
> > > before applying each transaction? If so, that might be overhead, maybe
> > > we can build some cache of the highest commitLSN that can be consulted
> > > rather than the catalog table.
> >
> > I think workers can cache that information when starts and invalidates
> > and reload the cache when the catalog gets updated.  Specifying to
> > skip XID will update the catalog, invalidating the cache.
> >
> > > I think we need to think about when to
> > > remove rows for which conflict has been resolved as we can't let that
> > > information grow infinitely.
> >
> > I guess we can update catalog tuples in place when another conflict
> > happens next time. The catalog tuple should be fixed size. The
> > already-resolved conflict will have the commit LSN older than its
> > replication origin's LSN.
> >
>
> Okay, but I have a slight concern that we will keep xid in the system
> which might have been no longer valid. So, we will keep this info
> about subscribers around till one performs drop subscription,
> hopefully, that doesn't lead to too many rows. This will be okay as
> per the current design but say tomorrow we decide to parallelize the
> apply for a subscription then there could be multiple errors
> corresponding to a subscription and in that case, such a design might
> appear quite limiting. One possibility could be that when the launcher
> is periodically checking for new error messages, it can clean up the
> conflicts catalog as well, or maybe autovacuum does this periodically
> as it does for stats (via pgstat_vacuum_stat).

Yeah, it's better to have a way to clean up no-longer-valid entries in
the catalog in the case where the worker failed to remove them. I prefer
the former idea so far, so I'll implement it in a PoC patch.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

May 31, 2021, 14:40:50

On Mon, May 31, 2021 at 12:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sat, May 29, 2021 at 3:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > > > 1. the worker records the XID and commit LSN of the failed transaction
> > > > > to a catalog.
> > > > >
> > > >
> > > > When will you record this info? I am not sure if we can try to update
> > > > this when an error has occurred. We can think of using try..catch in
> > > > apply worker and then record it in catch on error but would that be
> > > > advisable? One random thought that occurred to me is to that apply
> > > > worker notifies such information to the launcher (or maybe another
> > > > process) which will log this information.
> > >
> > > Yeah, I was concerned about that too and had the same idea. The
> > > information still could not be written if the server crashes before
> > > the launcher writes it. But I think it's an acceptable.
> > >
> >
> > True, because even if the launcher restarts, the apply worker will
> > error out again and resend the information. I guess we can have an
> > error queue where apply workers can add their information and the
> > launcher will then process those.  If we do that, then we need to
> > probably define what we want to do if the queue gets full, either
> > apply worker nudge launcher and wait or it can just throw an error and
> > continue. If you have any better ideas to share this information then
> > we can consider those as well.
>
> +1 for using error queue. Maybe we need to avoid queuing the same
> error more than once to avoid the catalog from being updated
> frequently?
>

Yes, I think it is important because, after logging, the subscription
may still error again unless the user does something to skip or
resolve the conflict. I guess you need to check for the existence of
the error in the system table and/or in the queue.
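
To illustrate that existence check (a sketch only, with hypothetical
names), the reporter below refuses to queue an error that is already in
the catalog or already waiting in the queue. Errors are identified by a
(subscription id, XID) pair, which is an assumption of this sketch.

```python
class ErrorReporter:
    def __init__(self):
        self.catalog = set()        # (subid, xid) pairs already written
        self.pending = []           # entries waiting for the launcher
        self._pending_keys = set()  # fast membership test for the queue

    def report(self, subid, xid):
        """Queue the error unless it is already known. Returns True if queued."""
        key = (subid, xid)
        if key in self.catalog or key in self._pending_keys:
            return False
        self.pending.append(key)
        self._pending_keys.add(key)
        return True

    def launcher_drain(self):
        """Launcher side: write all pending entries and report how many."""
        written = len(self.pending)
        self.catalog.update(self.pending)
        self.pending.clear()
        self._pending_keys.clear()
        return written
```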

> >
> > >
> > > I guess we can update catalog tuples in place when another conflict
> > > happens next time. The catalog tuple should be fixed size. The
> > > already-resolved conflict will have the commit LSN older than its
> > > replication origin's LSN.
> > >
> >
> > Okay, but I have a slight concern that we will keep xid in the system
> > which might have been no longer valid. So, we will keep this info
> > about subscribers around till one performs drop subscription,
> > hopefully, that doesn't lead to too many rows. This will be okay as
> > per the current design but say tomorrow we decide to parallelize the
> > apply for a subscription then there could be multiple errors
> > corresponding to a subscription and in that case, such a design might
> > appear quite limiting. One possibility could be that when the launcher
> > is periodically checking for new error messages, it can clean up the
> > conflicts catalog as well, or maybe autovacuum does this periodically
> > as it does for stats (via pgstat_vacuum_stat).
>
> Yeah, it's better to have a way to cleanup no longer valid entries in
> the catalog in the case where the worker failed to remove it. I prefer
> the former idea so far,
>

Which idea do you refer to here as former (cleaning up by launcher)?

> so I'll implement it in a PoC patch.
>

Okay.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Peter Eisentraut

Date:

May 31, 2021, 22:25:55

On 27.05.21 12:04, Amit Kapila wrote:
>>> Also, I am thinking that instead of a stat view, do we need
>>> to consider having a system table (pg_replication_conflicts or
>>> something like that) for this because what if stats information is
>>> lost (say either due to crash or due to udp packet loss), can we rely
>>> on stats view for this?
>> Yeah, it seems better to use a catalog.
>>
> Okay.

Could you store it in shared memory?  You don't need it to be crash safe,
since the subscription will just run into the same error again after 
restart.  You just don't want it to be lost, like with the statistics 
collector.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

June 1, 2021, 07:01:33

On Tue, Jun 1, 2021 at 12:55 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 27.05.21 12:04, Amit Kapila wrote:
> >>> Also, I am thinking that instead of a stat view, do we need
> >>> to consider having a system table (pg_replication_conflicts or
> >>> something like that) for this because what if stats information is
> >>> lost (say either due to crash or due to udp packet loss), can we rely
> >>> on stats view for this?
> >> Yeah, it seems better to use a catalog.
> >>
> > Okay.
>
> Could you store it shared memory?  You don't need it to be crash safe,
> since the subscription will just run into the same error again after
> restart.  You just don't want it to be lost, like with the statistics
> collector.
>

But, won't that be costly in cases where we have errors in the
processing of very large transactions? Subscription has to process all
the data before it gets an error. I think we can even imagine this
feature to be extended to use commitLSN as a skip candidate in which
case we can even avoid getting the data of that transaction from the
publisher. So if this information is persistent, the user can even set
the skip identifier after the restart before the publisher can send
all the data.

Also, I think we can't assume after the restart we will get the same
error because the user can perform some operations after the restart
and before we try to apply the same transaction. It might be that the
user wanted to see all the errors before setting the skip
identifier (and/or method).

I think the XID (or say another identifier like commitLSN) which we
want to use for skipping the transaction as specified by the user has
to be stored in the catalog because otherwise, after the restart we
won't remember it and the user won't know that he needs to set it
again. Now, say we have multiple skip identifiers (XIDs, commitLSN,
..), isn't it better to store all conflict-related information in a
separate catalog like pg_subscription_conflict or something like that?
I think it might be also better to later extend it for auto conflict
resolution where the user can specify auto conflict resolution info
for a subscription. Is it better to store all such information in
pg_subscription or have a separate catalog? It is possible that even
if we have a separate catalog for conflict info, we might not want to
store error info there.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

June 1, 2021, 07:37:22

On Tue, Jun 1, 2021 at 1:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jun 1, 2021 at 12:55 AM Peter Eisentraut
> <peter.eisentraut@enterprisedb.com> wrote:
> >
> > On 27.05.21 12:04, Amit Kapila wrote:
> > >>> Also, I am thinking that instead of a stat view, do we need
> > >>> to consider having a system table (pg_replication_conflicts or
> > >>> something like that) for this because what if stats information is
> > >>> lost (say either due to crash or due to udp packet loss), can we rely
> > >>> on stats view for this?
> > >> Yeah, it seems better to use a catalog.
> > >>
> > > Okay.
> >
> > Could you store it shared memory?  You don't need it to be crash safe,
> > since the subscription will just run into the same error again after
> > restart.  You just don't want it to be lost, like with the statistics
> > collector.
> >
>
> But, won't that be costly in cases where we have errors in the
> processing of very large transactions? Subscription has to process all
> the data before it gets an error.

I had the same concern. Particularly, the approach we currently
discussed is to skip the transaction based on the information written
by the worker rather than require the user to specify the XID.
Therefore, we will always require the worker to process the same large
transaction after the restart in order to skip the transaction.

> I think we can even imagine this
> feature to be extended to use commitLSN as a skip candidate in which
> case we can even avoid getting the data of that transaction from the
> publisher. So if this information is persistent, the user can even set
> the skip identifier after the restart before the publisher can send
> all the data.

Another possible benefit of writing it to a catalog is that we can
replicate it to the physical standbys. If we have failover slots in
the future, the physical standby server also can resolve the conflict
without processing a possibly large transaction.

> I think the XID (or say another identifier like commitLSN) which we
> want to use for skipping the transaction as specified by the user has
> to be stored in the catalog because otherwise, after the restart we
> won't remember it and the user won't know that he needs to set it
> again. Now, say we have multiple skip identifiers (XIDs, commitLSN,
> ..), isn't it better to store all conflict-related information in a
> separate catalog like pg_subscription_conflict or something like that.
> I think it might be also better to later extend it for auto conflict
> resolution where the user can specify auto conflict resolution info
> for a subscription. Is it better to store all such information in
> pg_subscription or have a separate catalog? It is possible that even
> if we have a separate catalog for conflict info, we might not want to
> store error info there.

Just to be clear, we need to store only the conflict-related
information that cannot be resolved without manual intervention,
right? That is, conflicts that cause an error and make the workers exit. In
general, replication conflicts include also conflicts that don’t cause
an error. I think that those conflicts don’t necessarily need to be
stored in the catalog and don’t require manual intervention.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

June 1, 2021, 08:28:10

On Tue, Jun 1, 2021 at 10:07 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Jun 1, 2021 at 1:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Jun 1, 2021 at 12:55 AM Peter Eisentraut
> > <peter.eisentraut@enterprisedb.com> wrote:
> > >
> > > On 27.05.21 12:04, Amit Kapila wrote:
> > > >>> Also, I am thinking that instead of a stat view, do we need
> > > >>> to consider having a system table (pg_replication_conflicts or
> > > >>> something like that) for this because what if stats information is
> > > >>> lost (say either due to crash or due to udp packet loss), can we rely
> > > >>> on stats view for this?
> > > >> Yeah, it seems better to use a catalog.
> > > >>
> > > > Okay.
> > >
> > > Could you store it shared memory?  You don't need it to be crash safe,
> > > since the subscription will just run into the same error again after
> > > restart.  You just don't want it to be lost, like with the statistics
> > > collector.
> > >
> >
> > But, won't that be costly in cases where we have errors in the
> > processing of very large transactions? Subscription has to process all
> > the data before it gets an error.
>
> I had the same concern. Particularly, the approach we currently
> discussed is to skip the transaction based on the information written
> by the worker rather than require the user to specify the XID.
>

Yeah, but I was imagining that the user still needs to specify
something to indicate that we need to skip it; otherwise, we might try
to skip a transaction that the user wants to resolve themselves rather
than expecting us to skip it. Another point is that if we don't store
this information in a persistent way, how will we prevent a user from
specifying some random XID that has not even errored after a restart?
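
As a sketch of that restriction (illustrative Python with hypothetical
names, not a proposed interface): if the XID of the failed transaction
is persisted, a request to skip any other XID can be rejected.

```python
class SubscriptionState:
    def __init__(self):
        self.failed_xid = None  # persisted XID of the last failed transaction
        self.skip_xid = None    # XID the user asked to skip


def set_skip_xid(state, xid):
    """Accept the skip request only for the transaction that actually failed."""
    if state.failed_xid is None or xid != state.failed_xid:
        raise ValueError(f"XID {xid} is not a recorded failed transaction")
    state.skip_xid = xid
```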

> Therefore, we will always require the worker to process the same large
> transaction after the restart in order to skip the transaction.
>
> > I think we can even imagine this
> > feature to be extended to use commitLSN as a skip candidate in which
> > case we can even avoid getting the data of that transaction from the
> > publisher. So if this information is persistent, the user can even set
> > the skip identifier after the restart before the publisher can send
> > all the data.
>
> Another possible benefit of writing it to a catalog is that we can
> replicate it to the physical standbys. If we have failover slots in
> the future, the physical standby server also can resolve the conflict
> without processing a possibly large transaction.
>

Makes sense.

> > I think the XID (or say another identifier like commitLSN) which we
> > want to use for skipping the transaction as specified by the user has
> > to be stored in the catalog because otherwise, after the restart we
> > won't remember it and the user won't know that he needs to set it
> > again. Now, say we have multiple skip identifiers (XIDs, commitLSN,
> > ..), isn't it better to store all conflict-related information in a
> > separate catalog like pg_subscription_conflict or something like that.
> > I think it might be also better to later extend it for auto conflict
> > resolution where the user can specify auto conflict resolution info
> > for a subscription. Is it better to store all such information in
> > pg_subscription or have a separate catalog? It is possible that even
> > if we have a separate catalog for conflict info, we might not want to
> > store error info there.
>
> Just to be clear, we need to store only the conflict-related
> information that cannot be resolved without manual intervention,
> right? That is, conflicts cause an error, exiting the workers. In
> general, replication conflicts include also conflicts that don’t cause
> an error. I think that those conflicts don’t necessarily need to be
> stored in the catalog and don’t require manual intervention.
>

Yeah, I think we want to record the error cases, but which other
conflicts are you talking about here that don't lead to any sort of
error?

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

June 1, 2021, 10:53:41

On Tue, Jun 1, 2021 at 2:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jun 1, 2021 at 10:07 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Jun 1, 2021 at 1:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Jun 1, 2021 at 12:55 AM Peter Eisentraut
> > > <peter.eisentraut@enterprisedb.com> wrote:
> > > >
> > > > On 27.05.21 12:04, Amit Kapila wrote:
> > > > >>> Also, I am thinking that instead of a stat view, do we need
> > > > >>> to consider having a system table (pg_replication_conflicts or
> > > > >>> something like that) for this because what if stats information is
> > > > >>> lost (say either due to crash or due to udp packet loss), can we rely
> > > > >>> on stats view for this?
> > > > >> Yeah, it seems better to use a catalog.
> > > > >>
> > > > > Okay.
> > > >
> > > > Could you store it shared memory?  You don't need it to be crash safe,
> > > > since the subscription will just run into the same error again after
> > > > restart.  You just don't want it to be lost, like with the statistics
> > > > collector.
> > > >
> > >
> > > But, won't that be costly in cases where we have errors in the
> > > processing of very large transactions? Subscription has to process all
> > > the data before it gets an error.
> >
> > I had the same concern. Particularly, the approach we currently
> > discussed is to skip the transaction based on the information written
> > by the worker rather than require the user to specify the XID.
> >
>
> Yeah, but I was imagining that the user still needs to specify
> something to indicate that we need to skip it, otherwise, we might try
> to skip a transaction that the user wants to resolve by itself rather
> than expecting us to skip it.

Yeah, currently what I'm thinking is that the worker writes the
conflict that caused an error somewhere. If the user wants to resolve
it manually, they can specify the resolution method for the stopped
subscription. Until the user specifies the method and the worker
resolves the conflict, or some fields of the subscription such as
subconninfo are updated, the conflict remains unresolved and the
information persists.

>
> > > I think the XID (or say another identifier like commitLSN) which we
> > > want to use for skipping the transaction as specified by the user has
> > > to be stored in the catalog because otherwise, after the restart we
> > > won't remember it and the user won't know that he needs to set it
> > > again. Now, say we have multiple skip identifiers (XIDs, commitLSN,
> > > ..), isn't it better to store all conflict-related information in a
> > > separate catalog like pg_subscription_conflict or something like that.
> > > I think it might be also better to later extend it for auto conflict
> > > resolution where the user can specify auto conflict resolution info
> > > for a subscription. Is it better to store all such information in
> > > pg_subscription or have a separate catalog? It is possible that even
> > > if we have a separate catalog for conflict info, we might not want to
> > > store error info there.
> >
> > Just to be clear, we need to store only the conflict-related
> > information that cannot be resolved without manual intervention,
> > right? That is, conflicts cause an error, exiting the workers. In
> > general, replication conflicts include also conflicts that don’t cause
> > an error. I think that those conflicts don’t necessarily need to be
> > stored in the catalog and don’t require manual intervention.
> >
>
> Yeah, I think we want to record the error cases but which other
> conflicts you are talking about here which doesn't lead to any sort of
> error?

For example, I think one type of replication conflict is when two
updates, arriving via logical replication or from a client, modify
the same record (e.g., one with the same primary key) at the same time.
In that case an error doesn't happen and we always choose the update
that arrived later. But there are other possible resolution methods,
such as choosing the one that arrived earlier, using the one with the
newer commit timestamp, using something like node priority, or
even raising an error so that the user resolves it manually.
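
To make the resolution methods listed above concrete, here is a minimal Python sketch of such policies for two updates to the same key. Everything in it (the Update record, the policy names, the resolve function) is hypothetical and only illustrates the idea; it makes no claim about how the apply worker behaves today.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Update:
    """A hypothetical update to one row, identified by its primary key."""
    key: int
    value: str
    arrival_seq: int      # order in which this node received the update
    commit_ts: datetime   # commit timestamp on the originating node
    node_priority: int    # larger means the origin node is preferred


class ConflictError(Exception):
    """Raised when the policy leaves the resolution to the user."""


def resolve(a: Update, b: Update, policy: str) -> Update:
    """Pick the winning update for the same key under the given policy."""
    if a.key != b.key:
        raise ValueError("updates do not conflict: different keys")
    if policy == "last_arrived":
        return max(a, b, key=lambda u: u.arrival_seq)
    if policy == "first_arrived":
        return min(a, b, key=lambda u: u.arrival_seq)
    if policy == "newest_commit_ts":
        return max(a, b, key=lambda u: u.commit_ts)
    if policy == "node_priority":
        return max(a, b, key=lambda u: u.node_priority)
    if policy == "error":
        raise ConflictError(f"conflicting updates for key {a.key}")
    raise ValueError(f"unknown policy: {policy}")


# Two made-up updates to the row with primary key 1.
local = Update(1, "from client", arrival_seq=1,
               commit_ts=datetime(2021, 6, 1, 10, 0, 5), node_priority=1)
remote = Update(1, "from publisher", arrival_seq=2,
                commit_ts=datetime(2021, 6, 1, 10, 0, 1), node_priority=2)
```

Only the "error" policy stops replication and needs manual intervention, which is the case this thread is about.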

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

June 1, 2021, 12:47:27

On Tue, Jun 1, 2021 at 1:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Jun 1, 2021 at 2:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Jun 1, 2021 at 10:07 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Tue, Jun 1, 2021 at 1:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Tue, Jun 1, 2021 at 12:55 AM Peter Eisentraut
> > > > <peter.eisentraut@enterprisedb.com> wrote:
> > > > >
> > > > > On 27.05.21 12:04, Amit Kapila wrote:
> > > > > >>> Also, I am thinking that instead of a stat view, do we need
> > > > > >>> to consider having a system table (pg_replication_conflicts or
> > > > > >>> something like that) for this because what if stats information is
> > > > > >>> lost (say either due to crash or due to udp packet loss), can we rely
> > > > > >>> on stats view for this?
> > > > > >> Yeah, it seems better to use a catalog.
> > > > > >>
> > > > > > Okay.
> > > > >
> > > > > Could you store it shared memory?  You don't need it to be crash safe,
> > > > > since the subscription will just run into the same error again after
> > > > > restart.  You just don't want it to be lost, like with the statistics
> > > > > collector.
> > > > >
> > > >
> > > > But, won't that be costly in cases where we have errors in the
> > > > processing of very large transactions? Subscription has to process all
> > > > the data before it gets an error.
> > >
> > > I had the same concern. Particularly, the approach we currently
> > > discussed is to skip the transaction based on the information written
> > > by the worker rather than require the user to specify the XID.
> > >
> >
> > Yeah, but I was imagining that the user still needs to specify
> > something to indicate that we need to skip it, otherwise, we might try
> > to skip a transaction that the user wants to resolve by itself rather
> > than expecting us to skip it.
>
> Yeah, currently what I'm thinking is that the worker writes the
> conflict that caused an error somewhere. If the user wants to resolve
> it manually they can specify the resolution method to the stopped
> subscription. Until the user specifies the method and the worker
> resolves it or some fields of the subscription such as subconninfo are
> updated, the conflict is not resolved and the information lasts.
>

I think we can work out such details, but tinkering with
subconninfo was not what I had in mind.

> >
> > > > I think the XID (or say another identifier like commitLSN) which we
> > > > want to use for skipping the transaction as specified by the user has
> > > > to be stored in the catalog because otherwise, after the restart we
> > > > won't remember it and the user won't know that he needs to set it
> > > > again. Now, say we have multiple skip identifiers (XIDs, commitLSN,
> > > > ..), isn't it better to store all conflict-related information in a
> > > > separate catalog like pg_subscription_conflict or something like that.
> > > > I think it might be also better to later extend it for auto conflict
> > > > resolution where the user can specify auto conflict resolution info
> > > > for a subscription. Is it better to store all such information in
> > > > pg_subscription or have a separate catalog? It is possible that even
> > > > if we have a separate catalog for conflict info, we might not want to
> > > > store error info there.
> > >
> > > Just to be clear, we need to store only the conflict-related
> > > information that cannot be resolved without manual intervention,
> > > right? That is, conflicts cause an error, exiting the workers. In
> > > general, replication conflicts include also conflicts that don’t cause
> > > an error. I think that those conflicts don’t necessarily need to be
> > > stored in the catalog and don’t require manual intervention.
> > >
> >
> > Yeah, I think we want to record the error cases but which other
> > conflicts you are talking about here which doesn't lead to any sort of
> > error?
>
> For example, I think it's one type of replication conflict that two
> updates that arrived via logical replication or from the client update
> the same record (e.g., having the same primary key) at the same time.
> In that case an error doesn't happen and we always choose the update
> that arrived later.
>

I think we choose whichever is earlier as we first try to find the
tuple in local rel and if not found then we silently ignore the
update/delete operation.

> But there are other possible resolution methods
> such as choosing the one that arrived former, using the one having a
> newer commit timestamp, using something like priority of the node, and
> even raising an error so that the user manually resolves it.
>

Agreed. I think we need to log only the ones which lead to error.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Peter Eisentraut

Date:

June 1, 2021, 18:35:44

On 01.06.21 06:01, Amit Kapila wrote:
> But, won't that be costly in cases where we have errors in the
> processing of very large transactions? Subscription has to process all
> the data before it gets an error. I think we can even imagine this
> feature to be extended to use commitLSN as a skip candidate in which
> case we can even avoid getting the data of that transaction from the
> publisher. So if this information is persistent, the user can even set
> the skip identifier after the restart before the publisher can send
> all the data.

At least in current practice, skipping parts of the logical replication 
stream on the subscriber is a rare, emergency-level operation when 
something that shouldn't have happened happened.  So it doesn't really 
matter how costly it is.  It's not going to be more costly than the 
error happening in the first place.  All you'd need is one shared memory 
slot per subscription to store a xid to skip.

We will also want some proper conflict handling at some point.  But I 
think what is being discussed here is meant to be a repair tool, not a 
policy tool, and I'm afraid it might get over-engineered.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

June 2, 2021, 09:07:06

On Tue, Jun 1, 2021 at 9:05 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 01.06.21 06:01, Amit Kapila wrote:
> > But, won't that be costly in cases where we have errors in the
> > processing of very large transactions? Subscription has to process all
> > the data before it gets an error. I think we can even imagine this
> > feature to be extended to use commitLSN as a skip candidate in which
> > case we can even avoid getting the data of that transaction from the
> > publisher. So if this information is persistent, the user can even set
> > the skip identifier after the restart before the publisher can send
> > all the data.
>
> At least in current practice, skipping parts of the logical replication
> stream on the subscriber is a rare, emergency-level operation when
> something that shouldn't have happened happened.  So it doesn't really
> matter how costly it is.  It's not going to be more costly than the
> error happening in the first place.  All you'd need is one shared memory
> slot per subscription to store a xid to skip.
>

Leaving aside the performance point, how can we do this by just storing
the skip identifier (XID/commitLSN) in shared memory? How will the apply
worker know about that information after a restart? Do you expect the
user to set it again? If so, I think users might not like that. Also,
how will we prevent users from giving an identifier other than that of
a failed transaction, and if users do provide one, what should our
action be? Without such a check, if users provide the XID of some
in-progress transaction, we might need to do more work (rollback) than
just skipping it.

> We will also want some proper conflict handling at some point.  But I
> think what is being discussed here is meant to be a repair tool, not a
> policy tool, and I'm afraid it might get over-engineered.
>

I get your point, but I am a bit concerned that handling all the
boundary cases might become tricky if we go with a simple shared
memory technique. OTOH, if we can handle all such cases, then it is
fine.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

June 15, 2021, 03:43:18

On Wed, Jun 2, 2021 at 3:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jun 1, 2021 at 9:05 PM Peter Eisentraut
> <peter.eisentraut@enterprisedb.com> wrote:
> >
> > On 01.06.21 06:01, Amit Kapila wrote:
> > > But, won't that be costly in cases where we have errors in the
> > > processing of very large transactions? Subscription has to process all
> > > the data before it gets an error. I think we can even imagine this
> > > feature to be extended to use commitLSN as a skip candidate in which
> > > case we can even avoid getting the data of that transaction from the
> > > publisher. So if this information is persistent, the user can even set
> > > the skip identifier after the restart before the publisher can send
> > > all the data.
> >
> > At least in current practice, skipping parts of the logical replication
> > stream on the subscriber is a rare, emergency-level operation when
> > something that shouldn't have happened happened.  So it doesn't really
> > matter how costly it is.  It's not going to be more costly than the
> > error happening in the first place.  All you'd need is one shared memory
> > slot per subscription to store a xid to skip.
> >
>
> Leaving aside the performance point, how can we do by just storing
> skip identifier (XID/commitLSN) in shared_memory? How will the apply
> worker know about that information after restart? Do you expect the
> user to set it again, if so, I think users might not like that? Also,
> how will we prohibit users to give some identifier other than for
> failed transactions, and if users provide that what should be our
> action? Without that, if users provide XID of some in-progress
> transaction, we might need to do more work (rollback) than just
> skipping it.

I think the simplest solution would be to have a fixed-size array in
shared memory to store information about the transactions to skip on a
particular subscription. Given that this feature is meant to be a
repair tool for emergency cases, 32 or 64 entries seem enough. That
information should be visible to users via a system view, and each
entry is cleared once the worker has skipped the transaction. We
would also need to clear the entry if the meta information of the
subscription, such as conninfo or slot name, has been changed. The
worker reads that information at least when starting logical
replication. The worker receives changes from the publisher and
checks whether the transaction should be skipped when it starts to
apply those changes. If so, the worker skips applying all changes of
the transaction and removes the stream files, if any exist.
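
As an illustration of the fixed-size array described above, here is a minimal Python sketch. It is a model only, not the proposed C implementation: the SkipTable class, its method names, and the choice of keying entries by (subid, xid) pairs are assumptions made for the example, and a plain list stands in for shared memory.

```python
from typing import List, Optional, Tuple

MAX_SKIP_ENTRIES = 32  # assumed capacity; the message suggests 32 or 64


class SkipTable:
    """Fixed-size table of (subid, xid) pairs, standing in for shared memory."""

    def __init__(self, capacity: int = MAX_SKIP_ENTRIES) -> None:
        self._slots: List[Optional[Tuple[int, int]]] = [None] * capacity

    def add(self, subid: int, xid: int) -> None:
        """Register a transaction to skip; fail if every slot is in use."""
        if (subid, xid) in self._slots:
            return  # already registered
        for i, slot in enumerate(self._slots):
            if slot is None:
                self._slots[i] = (subid, xid)
                return
        raise RuntimeError("no free skip slot")

    def should_skip(self, subid: int, xid: int) -> bool:
        """Checked by the worker when it starts to apply a transaction."""
        return (subid, xid) in self._slots

    def clear(self, subid: int, xid: int) -> None:
        """Called by the worker once it has skipped the transaction."""
        self._slots = [None if s == (subid, xid) else s for s in self._slots]

    def clear_subscription(self, subid: int) -> None:
        """Called when conninfo or slot name of the subscription changes."""
        self._slots = [None if s is not None and s[0] == subid else s
                       for s in self._slots]

    def entries(self) -> List[Tuple[int, int]]:
        """What a system view over the table would show."""
        return [s for s in self._slots if s is not None]
```

Because entries are cleared as soon as they are consumed, the table only ever holds transactions that are still waiting to be skipped, which is why a small fixed capacity may be enough.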

Regarding the point of how to check whether the XID specified by the
user is valid, I guess it’s not easy to do that since XIDs sent from
the publisher are in random order. Considering the use case of this
tool, the typical situation is that logical replication gets stuck due
to a problem transaction and the worker repeatedly restarts and raises
an error. So I guess it would also be a good idea to let the user
specify skipping the first transaction (or the first N transactions)
after the subscription starts logical replication. It’s less flexible
but seems enough to solve such a situation, and it doesn’t have the
problem of validating the XID. If functionality that lets the
subscriber know the oldest XID that could possibly be sent is also
useful for other purposes, it would be a good idea to implement it,
but I'm not sure about other use cases.

Anyway, it seems to me that we need to consider the user interface
first, especially how the user specifies the transaction to skip and
what exactly they specify. My current feeling is that specifying the
XID is intuitive and flexible, but the user needs two steps: check the
XID and then specify it, and there is a risk that the user mistakenly
specifies a wrong XID. On the other hand, the idea of skipping the
first transaction doesn’t require the user to check and specify the
XID but is less flexible, and “the first” transaction might be
ambiguous for the user.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

June 16, 2021, 12:05:08

On Tue, Jun 15, 2021 at 6:13 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jun 2, 2021 at 3:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Jun 1, 2021 at 9:05 PM Peter Eisentraut
> > <peter.eisentraut@enterprisedb.com> wrote:
> > >
> > > On 01.06.21 06:01, Amit Kapila wrote:
> > > > But, won't that be costly in cases where we have errors in the
> > > > processing of very large transactions? Subscription has to process all
> > > > the data before it gets an error. I think we can even imagine this
> > > > feature to be extended to use commitLSN as a skip candidate in which
> > > > case we can even avoid getting the data of that transaction from the
> > > > publisher. So if this information is persistent, the user can even set
> > > > the skip identifier after the restart before the publisher can send
> > > > all the data.
> > >
> > > At least in current practice, skipping parts of the logical replication
> > > stream on the subscriber is a rare, emergency-level operation when
> > > something that shouldn't have happened happened.  So it doesn't really
> > > matter how costly it is.  It's not going to be more costly than the
> > > error happening in the first place.  All you'd need is one shared memory
> > > slot per subscription to store a xid to skip.
> > >
> >
> > Leaving aside the performance point, how can we do by just storing
> > skip identifier (XID/commitLSN) in shared_memory? How will the apply
> > worker know about that information after restart? Do you expect the
> > user to set it again, if so, I think users might not like that? Also,
> > how will we prohibit users to give some identifier other than for
> > failed transactions, and if users provide that what should be our
> > action? Without that, if users provide XID of some in-progress
> > transaction, we might need to do more work (rollback) than just
> > skipping it.
>
> I think the simplest solution would be to have a fixed-size array on
> the shared memory to store information of skipping transactions on the
> particular subscription. Given that this feature is meant to be a
> repair tool in emergency cases, 32 or 64 entries seem enough.
>

IIUC, here you are talking about XIDs specified by the user to skip?
If so, then how will you get that information after the restart, and
why do you need 32 or 64 entries for it?

>
> Anyway, it seems to me that we need to consider the user interface
> first, especially how and what the user specifies the transaction to
> skip. My current feeling is that specifying XID is intuitive and
> flexible but the user needs to have 2 steps: checks XID and then
> specifies it, and there is a risk that the user mistakenly specifies a
> wrong XID. On the other hand, the idea of specifying to skip the first
> transaction doesn’t require the user to check and specify XID but is
> less flexible, and “the first” transaction might be ambiguous for the
> user.
>

I see your point in allowing the user to specify the first N
transactions, but OTOH, I am slightly afraid that it might lead to
skipping some useful transactions, which will make the replica
out-of-sync. BTW, is there any data point for the user to check how
many transactions it can skip? Normally, we won't be able to proceed
till we resolve/skip the transaction that is generating an error. One
possibility could be that we provide some *superuser* functions like
pg_logical_replication_skip_xact()/pg_logical_replication_reset_skip_xact()
which take the subscription name/id and xid as input parameters. Then,
I think we can store this information in ReplicationState and probably
map from subscription name/id to originid to retrieve that info. We
can probably document that the effects of these functions won't last
after a restart. Now, if these functions are used only by superusers,
then we can probably trust that the XIDs they provide are safe to
skip, but OTOH restricting these functions to superusers might limit
the usage of this repair tool.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

June 17, 2021, 09:24:03

On Wed, Jun 16, 2021 at 6:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jun 15, 2021 at 6:13 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Jun 2, 2021 at 3:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Jun 1, 2021 at 9:05 PM Peter Eisentraut
> > > <peter.eisentraut@enterprisedb.com> wrote:
> > > >
> > > > On 01.06.21 06:01, Amit Kapila wrote:
> > > > > But, won't that be costly in cases where we have errors in the
> > > > > processing of very large transactions? Subscription has to process all
> > > > > the data before it gets an error. I think we can even imagine this
> > > > > feature to be extended to use commitLSN as a skip candidate in which
> > > > > case we can even avoid getting the data of that transaction from the
> > > > > publisher. So if this information is persistent, the user can even set
> > > > > the skip identifier after the restart before the publisher can send
> > > > > all the data.
> > > >
> > > > At least in current practice, skipping parts of the logical replication
> > > > stream on the subscriber is a rare, emergency-level operation when
> > > > something that shouldn't have happened happened.  So it doesn't really
> > > > matter how costly it is.  It's not going to be more costly than the
> > > > error happening in the first place.  All you'd need is one shared memory
> > > > slot per subscription to store a xid to skip.
> > > >
> > >
> > > Leaving aside the performance point, how can we do by just storing
> > > skip identifier (XID/commitLSN) in shared_memory? How will the apply
> > > worker know about that information after restart? Do you expect the
> > > user to set it again, if so, I think users might not like that? Also,
> > > how will we prohibit users to give some identifier other than for
> > > failed transactions, and if users provide that what should be our
> > > action? Without that, if users provide XID of some in-progress
> > > transaction, we might need to do more work (rollback) than just
> > > skipping it.
> >
> > I think the simplest solution would be to have a fixed-size array on
> > the shared memory to store information of skipping transactions on the
> > particular subscription. Given that this feature is meant to be a
> > repair tool in emergency cases, 32 or 64 entries seem enough.
> >
>
> IIUC, here you are talking about xids specified by the user to skip?

Yes. I think we need to store pairs of subid and xid.

> If so, then how will you get that information after the restart, and
> why you need 32 or 64 entries for it?

That information doesn't last after the restart. I think that the
situation in which a DBA uses this tool would be one where they fix
the subscription on the spot. Once the subscription has skipped the
transaction, the entry for that information is cleared. So I’m thinking
that we don’t need to hold many entries and the information does not
necessarily need to be durable. I think your idea below of storing that
information in ReplicationState seems better to me.

>
> >
> > Anyway, it seems to me that we need to consider the user interface
> > first, especially how and what the user specifies the transaction to
> > skip. My current feeling is that specifying XID is intuitive and
> > flexible but the user needs to have 2 steps: checks XID and then
> > specifies it, and there is a risk that the user mistakenly specifies a
> > wrong XID. On the other hand, the idea of specifying to skip the first
> > transaction doesn’t require the user to check and specify XID but is
> > less flexible, and “the first” transaction might be ambiguous for the
> > user.
> >
>
> I see your point in allowing to specify First N transactions but OTOH,
> I am slightly afraid that it might lead to skipping some useful
> transactions which will make replica out-of-sync.

Agreed.

It might be better to skip only the first transaction.

>  BTW, is there any
> data point for the user to check how many transactions it can skip?
> Normally, we won't be able to proceed till we resolve/skip the
> transaction that is generating an error. One possibility could be that
> we provide some *superuser* functions like
> pg_logical_replication_skip_xact()/pg_logical_replication_reset_skip_xact()
> which takes subscription name/id and xid as input parameters. Then, I
> think we can store this information in ReplicationState and probably
> try to map to originid from subscription name/id to retrieve that
> info. We can probably document that the effects of these functions
> won't last after the restart.

ReplicationState seems a reasonable place to store that information.

> Now, if this function is used by super
> users then we can probably trust that they provide the XIDs that we
> can trust to be skipped but OTOH making a restriction to allow these
> functions to be used by superusers might restrict the usage of this
> repair tool.

If we specify the subscription id or name, maybe we can also allow the
owner of the subscription to do that operation?

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

June 17, 2021, 12:20:21

On Thu, Jun 17, 2021 at 3:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> > Now, if this function is used by super
> > users then we can probably trust that they provide the XIDs that we
> > can trust to be skipped but OTOH making a restriction to allow these
> > functions to be used by superusers might restrict the usage of this
> > repair tool.
>
> If we specify the subscription id or name, maybe we can allow also the
> owner of subscription to do that operation?

Ah, the owner of the subscription must be a superuser.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

June 28, 2021, 07:42:05

On Thu, Jun 17, 2021 at 6:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Jun 17, 2021 at 3:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > > Now, if this function is used by super
> > > users then we can probably trust that they provide the XIDs that we
> > > can trust to be skipped but OTOH making a restriction to allow these
> > > functions to be used by superusers might restrict the usage of this
> > > repair tool.
> >
> > If we specify the subscription id or name, maybe we can allow also the
> > owner of subscription to do that operation?
>
> Ah, the owner of the subscription must be superuser.

I've attached PoC patches.

The 0001 patch introduces the ability to skip transactions on the
subscriber side. We can specify the XID for the subscription with a
command like ALTER SUBSCRIPTION test_sub SET SKIP TRANSACTION 100. The
implementation seems straightforward except for setting the origin
state. After skipping the transaction, we have to update the session
origin state so that we can start streaming from the transaction next
to the one that we just skipped in case of a server crash or a restart
of the apply worker. Normally, we set the origin state in the commit
WAL record. However, since we skip all changes, we don’t write any WAL
even if we call CommitTransaction() at the end of the skipped
transaction. So the patch sets the origin state in the transaction
that updates the pg_subscription system catalog to reset the skip XID.
I think we need a discussion of this part.
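
The origin-state problem described above can be modeled with a small Python sketch. It is a toy model under stated assumptions, not the patch itself: RemoteTxn, Subscriber, and the wal list are hypothetical stand-ins for the remote transaction stream, the subscription's catalog state plus replication origin progress, and the WAL records that carry that progress.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class RemoteTxn:
    """A hypothetical remote transaction as seen by the apply worker."""
    xid: int
    commit_lsn: int
    changes: List[str]


@dataclass
class Subscriber:
    skip_xid: Optional[int] = None   # stands in for the catalog field
    origin_lsn: int = 0              # stands in for the origin progress
    applied: List[str] = field(default_factory=list)
    wal: List[Tuple[str, int]] = field(default_factory=list)

    def apply(self, txn: RemoteTxn) -> None:
        if txn.commit_lsn <= self.origin_lsn:
            return  # already processed before a restart
        if txn.xid == self.skip_xid:
            # No changes are applied, so there is no ordinary commit record
            # to carry the origin progress. Resetting skip_xid is itself a
            # catalog update, and the origin progress rides on that commit.
            self.skip_xid = None
            self.wal.append(("reset_skip_xid", txn.commit_lsn))
        else:
            self.applied.extend(txn.changes)
            self.wal.append(("commit", txn.commit_lsn))
        self.origin_lsn = txn.commit_lsn
```

In this model, a skipped transaction still advances origin_lsn, so a worker that restarts and receives the same transaction again ignores it instead of failing on it a second time.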

With the 0002 and 0003 patches, we report the error information in the
server logs and the stats view, respectively. The 0002 patch adds
errcontext to error messages raised while applying changes:

ERROR:  duplicate key value violates unique constraint "hoge_pkey"
DETAIL:  Key (c)=(1) already exists.
CONTEXT:  during apply of "INSERT" for relation "public.hoge" in
transaction with xid 736 committs 2021-06-27 12:12:30.053887+09
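
As a sketch of how a DBA might pull the remote XID out of such a log entry, the following Python snippet parses the CONTEXT line shown above. The regular expression assumes exactly the message wording of this PoC patch, which may change in later versions; the function name parse_apply_context is hypothetical.

```python
import re
from typing import Dict, Optional, Union

# Matches the errcontext wording shown in the PoC output; the line break
# between "in" and "transaction" in the log is tolerated via \s+.
APPLY_CONTEXT_RE = re.compile(
    r'during apply of "(?P<action>[^"]+)" '
    r'for relation "(?P<relation>[^"]+)"\s+in\s+'
    r'transaction with xid (?P<xid>\d+)'
)


def parse_apply_context(log_text: str) -> Optional[Dict[str, Union[str, int]]]:
    """Return action, relation, and remote XID, or None if nothing matches."""
    match = APPLY_CONTEXT_RE.search(log_text)
    if match is None:
        return None
    return {
        "action": match.group("action"),
        "relation": match.group("relation"),
        "xid": int(match.group("xid")),
    }
```

The returned xid is the value that would then be passed to the skip command.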

The 0003 patch adds the pg_stat_logical_replication_error statistics
view discussed on another thread[1]. The apply worker sends the error
information to the stats collector if an error happens while applying
changes. We can check those errors as follows:

postgres(1:25250)=# select * from pg_stat_logical_replication_error;
 subname  | relid | action | xid |         last_failure
----------+-------+--------+-----+-------------------------------
 test_sub | 16384 | INSERT | 736 | 2021-06-27 12:12:45.142675+09
(1 row)

I added only columns required for the skipping transaction feature to
the view for now.

Please note that those patches are meant to evaluate the concept we've
discussed so far. Those don't have the doc update yet.

Regards,

[1] https://www.postgresql.org/message-id/DB35438F-9356-4841-89A0-412709EBD3AB%40enterprisedb.com

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Hi,

I have a few notes about pg_stat_logical_replication_error from the point of view of a DBA (who will use this view in the future).

1. As I understand it, this view might contain many errors related to different subscriptions. It would be better to name it "pg_stat_logical_replication_errors", using the plural form (as is done for the stat views for tables, indexes, and functions). Also, I'd suggest thinking twice about the view name (and the function used in the view DDL): "pg_stat_logical_replication_error" contains the very general words "logical replication", but the view contains errors related to subscriptions only. In the future there could be other kinds of errors related to logical replication but not to subscriptions; what will you do then?

2. Add a field with database name or id - it helps to quickly understand to which database the subscription belongs.

3. Add a counter field with the total number of errors - it helps to calculate error rates and aggregations (sum), and avoids losing information about errors between view checks.

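4. Add text of last error (if it will not be too expensive).

5. Rename the "action" field to "command", as I know this is right from terminology point of view.

With these changes, a row of the view might look like this:

sub_2 | 12346 | 16458 | UPDATE | 845 | 12 | 2021-06-27 12:16:01.458752+09 | hmm, something goes wrong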

Regards, Alexey

On Mon, Jul 5, 2021 at 2:59 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Jun 17, 2021 at 6:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Jun 17, 2021 at 3:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > > Now, if this function is used by super
> > > users then we can probably trust that they provide the XIDs that we
> > > can trust to be skipped but OTOH making a restriction to allow these
> > > functions to be used by superusers might restrict the usage of this
> > > repair tool.
> >
> > If we specify the subscription id or name, maybe we can allow also the
> > owner of subscription to do that operation?
>
> Ah, the owner of the subscription must be superuser.

I've attached PoC patches.

0001 patch introduces the ability to skip transactions on the
subscriber side. We can specify XID to the subscription by like ALTER
SUBSCRIPTION test_sub SET SKIP TRANSACTION 100. The implementation
seems straightforward except for setting origin state. After skipping
the transaction we have to update the session origin state so that we
can start streaming the transaction next to the one that we just
skipped in case of the server crash or restarting the apply worker. We
set origin state to the commit WAL record. However, since we skip all
changes we don’t write any WAL even if we call CommitTransaction() at
the end of the skipped transaction. So the patch sets the origin state
to the transaction that updates the pg_subscription system catalog to
reset the skip XID. I think we need a discussion of this part.

With 0002 and 0003 patches, we report the error information in server
logs and the stats view, respectively. 0002 patch adds errcontext for
messages that happened during applying the changes:

ERROR: duplicate key value violates unique constraint "hoge_pkey"
DETAIL: Key (c)=(1) already exists.
CONTEXT: during apply of "INSERT" for relation "public.hoge" in
transaction with xid 736 committs 2021-06-27 12:12:30.053887+09

0003 patch adds pg_stat_logical_replication_error statistics view
discussed on another thread[1]. The apply worker sends the error
information to the stats collector if an error happens during applying
changes. We can check those errors as follow:

postgres(1:25250)=# select * from pg_stat_logical_replication_error;
 subname  | relid | action | xid |         last_failure
----------+-------+--------+-----+-------------------------------
 test_sub | 16384 | INSERT | 736 | 2021-06-27 12:12:45.142675+09
(1 row)

I added only columns required for the skipping transaction feature to
the view for now.

Please note that those patches are meant to evaluate the concept we've
discussed so far. Those don't have the doc update yet.

Regards,

[1] https://www.postgresql.org/message-id/DB35438F-9356-4841-89A0-412709EBD3AB%40enterprisedb.com

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

Best regards, Alexey V. Lesovsky

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

July 6, 2021, 08:58:17

On Mon, Jul 5, 2021 at 7:33 PM Alexey Lesovsky <lesovsky@gmail.com> wrote:
>
> Hi,
> Have a few notes about pg_stat_logical_replication_error from the DBA point of view (which will use this view in the future).

Thank you for the comments!

> 1. As I understand it, this view might contain many errors related to different subscriptions. It is better to name "pg_stat_logical_replication_errors" using the plural form (like this done for stat views for tables, indexes, functions).

Agreed.

> Also, I'd like to suggest thinking twice about the view name (and function used in view DDL) - "pg_stat_logical_replication_error" contains very common "logical replication" words, but the view contains errors related to subscriptions only. In the future there could be other kinds of errors related to logical replication, but not related to subscriptions - what will you do?

Is pg_stat_subscription_errors or
pg_stat_logical_replication_apply_errors better?

> 2. Add a field with database name or id - it helps to quickly understand to which database the subscription belongs.

Agreed.

> 3. Add a counter field with total number of errors - it helps to calculate errors rates and aggregations (sum), and don't lose information about errors between view checks.

Do you mean to increment the error count if the error (command, xid,
and relid) is the same as the previous one? or to have the total
number of errors per subscription? And what can we infer from the
error rates and aggregations?

> 4. Add text of last error (if it will not be too expensive).

Agreed.

> 5. Rename the "action" field to "command", as I know this is right from terminology point of view.

Okay.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

July 6, 2021, 09:59:49

On Mon, Jul 5, 2021 at 6:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Jul 1, 2021 at 6:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Jul 1, 2021 at 12:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > Don't we want to clear stats at drop subscription as well? We do drop
> > > database stats in dropdb via pgstat_drop_database, so I think we need
> > > to clear subscription stats at the time of drop subscription.
> >
> > Yes, it needs to be cleared. In the 0003 patch, pgstat_vacuum_stat()
> > sends the message to clear the stats. I think it's better to have
> > pgstat_vacuum_stat() do that job similar to dropping replication slot
> > statistics rather than relying on the single message send at DROP
> > SUBSCRIPTION. I've considered doing both: sending the message at DROP
> > SUBSCRIPTION and periodical checking by pgstat_vacuum_stat(), but
> > dropping subscription not setting a replication slot is able to
> > rollback. So we need to send it only at commit time. Given that we
> > don’t necessarily need the stats to be updated immediately, I think
> > it’s reasonable to go with only a way of pgstat_vacuum_stat().
> >
>
> Okay, that makes sense. Can we consider sending the multiple ids in
> one message as we do for relations or functions in
> pgstat_vacuum_stat()? That will reduce some message traffic.

Yes. Since subscriptions are objects that are not frequently created
and dropped I prioritized not to increase the message type. But if we
do that for subscriptions, is it better to do that for replication
slots as well? It seems to me that the lifetime of subscriptions and
replication slots are similar.

> BTW, do
> we have some way to avoid wrapping around the OID before we clean up
> via pgstat_vacuum_stat()?

As far as I know there is not.

>
>
> > > In the 0003 patch, if I am reading it correctly then the patch is not
> > > doing anything for tablesync worker. It is not clear to me at this
> > > stage what exactly we want to do about it? Do we want to just ignore
> > > errors from tablesync worker and let the system behave as it is
> > > without this feature? If we want to do anything then I think the way
> > > to skip the initial table sync would be to behave like the user has
> > > given 'copy_data' option as false.
> >
> > It might be better to have also sync workers report errors, even if
> > SKIP TRANSACTION feature doesn’t support anything for initial table
> > synchronization. From the user perspective, The initial table
> > synchronization is also the part of logical replication operations. If
> > we report only error information of applying logical changes, it could
> > confuse users.
> >
> > But I’m not sure about the way to skip the initial table
> > synchronization. Once we set `copy_data` to false, all table
> > synchronizations are disabled. Some of them might have been able to
> > synchronize successfully. It might be useful if the user can disable
> > the table initialization for the particular tables.
> >
>
> True but I guess the user can wait for all the tablesyncs to either
> finish or get an error corresponding to the table sync. After that, it
> can use 'copy_data' as false. This is not a very good method but I
> don't see any other option. I guess whatever is the case logging
> errors from tablesyncs is anyway not a bad idea.
>
> Instead of using the syntax "ALTER SUBSCRIPTION name SET SKIP
> TRANSACTION Iconst", isn't it better to use it as a subscription
> option like Mark has done for his patch (disable_on_error)?

According to the doc, ALTER SUBSCRIPTION ... SET is used to alter
parameters originally set by CREATE SUBSCRIPTION. Therefore, we can
specify a subset of parameters that can be specified by CREATE
SUBSCRIPTION. It makes sense to me for 'disable_on_error' since it can
be specified by CREATE SUBSCRIPTION. Whereas SKIP TRANSACTION stuff
cannot be done. Are you concerned about adding a syntax to ALTER
SUBSCRIPTION?

>
> I am slightly nervous about this way of allowing the user to skip the
> errors because if it is not used carefully then it can easily lead to
> inconsistent data on the subscriber. I agree that as only superusers
> will be allowed to use this option and we can document clearly the
> side-effects, the risk could be reduced but is that sufficient? It is
> not that we don't have any other tool which allows users to make their
> data inconsistent (one recent example is functions
> (heap_force_kill/heap_force_freeze) in pg_surgery module) if not used
> carefully but it might be better to not expose such tools.
>
> OTOH, if we use the error infrastructure of this patch and allow users
> to just disable the subscription on error as was proposed by Mark then
> that can't lead to any inconsistency.
>
> What do you think?

As you mentioned in another mail, what we can do with this feature is
the same as pg_replication_origin_advance(). Just as there is a risk
that the user specifies a wrong LSN to
pg_replication_origin_advance(), there is a similar risk with this
feature.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

July 6, 2021, 11:59:29

On Tue, Jul 6, 2021 at 11:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Jul 5, 2021 at 7:33 PM Alexey Lesovsky <lesovsky@gmail.com> wrote:
> >
> > Hi,
> > Have a few notes about pg_stat_logical_replication_error from the DBA point of view (which will use this view in the future).
>
> Thank you for the comments!
>
> > 1. As I understand it, this view might contain many errors related to different subscriptions. It is better to name "pg_stat_logical_replication_errors" using the plural form (like this done for stat views for tables, indexes, functions).
>
> Agreed.
>
> > Also, I'd like to suggest thinking twice about the view name (and function used in view DDL) - "pg_stat_logical_replication_error" contains very common "logical replication" words, but the view contains errors related to subscriptions only. In the future there could be other kinds of errors related to logical replication, but not related to subscriptions - what will you do?
>
> Is pg_stat_subscription_errors or
> pg_stat_logical_replication_apply_errors better?
>

Few more to consider: pg_stat_apply_failures,
pg_stat_subscription_failures, pg_stat_apply_conflicts,
pg_stat_subscription_conflicts.

> > 2. Add a field with database name or id - it helps to quickly understand to which database the subscription belongs.
>
> Agreed.
>
> > 3. Add a counter field with total number of errors - it helps to calculate errors rates and aggregations (sum), and don't lose information about errors between view checks.
>
> Do you mean to increment the error count if the error (command, xid,
> and relid) is the same as the previous one? or to have the total
> number of errors per subscription?
>

I would prefer the total number of errors per subscription.

> And what can we infer from the
> error rates and aggregations?
>

Say we add a column like failure_type/conflict_type as well; then one
might be interested in knowing how many conflicts are due to primary
key conflicts vs. update/delete conflicts.

You might want to consider keeping this view patch before the skip_xid
patch in your patch series, as this will be the base for the skip_xid
patch.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

July 6, 2021, 12:33:32

On Tue, Jul 6, 2021 at 12:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Jul 5, 2021 at 6:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Jul 1, 2021 at 6:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Jul 1, 2021 at 12:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > >
> > > > Don't we want to clear stats at drop subscription as well? We do drop
> > > > database stats in dropdb via pgstat_drop_database, so I think we need
> > > > to clear subscription stats at the time of drop subscription.
> > >
> > > Yes, it needs to be cleared. In the 0003 patch, pgstat_vacuum_stat()
> > > sends the message to clear the stats. I think it's better to have
> > > pgstat_vacuum_stat() do that job similar to dropping replication slot
> > > statistics rather than relying on the single message send at DROP
> > > SUBSCRIPTION. I've considered doing both: sending the message at DROP
> > > SUBSCRIPTION and periodical checking by pgstat_vacuum_stat(), but
> > > dropping subscription not setting a replication slot is able to
> > > rollback. So we need to send it only at commit time. Given that we
> > > don’t necessarily need the stats to be updated immediately, I think
> > > it’s reasonable to go with only a way of pgstat_vacuum_stat().
> > >
> >
> > Okay, that makes sense. Can we consider sending the multiple ids in
> > one message as we do for relations or functions in
> > pgstat_vacuum_stat()? That will reduce some message traffic.
>
> Yes. Since subscriptions are objects that are not frequently created
> and dropped I prioritized not to increase the message type. But if we
> do that for subscriptions, is it better to do that for replication
> slots as well? It seems to me that the lifetime of subscriptions and
> replication slots are similar.
>

Yeah, I think it makes sense to do for both, we can work on slots
patch separately. I don't see a reason why we shouldn't send a single
message for multiple clear/drop entries.
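
To make the batching idea concrete, here is a small Python sketch of grouping several stale ids into as few messages as possible. It is only an illustration: the function name, the sample OIDs, and the per-message capacity are made up for this example and are not taken from the patch or from pgstat.c.

```python
def batch_ids(ids, max_per_message):
    """Split a list of ids into chunks of at most max_per_message entries."""
    if max_per_message < 1:
        raise ValueError("max_per_message must be at least 1")
    return [
        ids[start:start + max_per_message]
        for start in range(0, len(ids), max_per_message)
    ]


# Five hypothetical subscription OIDs whose stats entries are stale.
stale_subscription_oids = [16384, 16390, 16401, 16412, 16420]
messages = batch_ids(stale_subscription_oids, max_per_message=2)
print(len(messages))  # 3 messages instead of 5
```

With one id per message the collector would receive five messages here; with a capacity of two it receives three.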

> >
> > True but I guess the user can wait for all the tablesyncs to either
> > finish or get an error corresponding to the table sync. After that, it
> > can use 'copy_data' as false. This is not a very good method but I
> > don't see any other option. I guess whatever is the case logging
> > errors from tablesyncs is anyway not a bad idea.
> >
> > Instead of using the syntax "ALTER SUBSCRIPTION name SET SKIP
> > TRANSACTION Iconst", isn't it better to use it as a subscription
> > option like Mark has done for his patch (disable_on_error)?
>
> According to the doc, ALTER SUBSCRIPTION ... SET is used to alter
> parameters originally set by CREATE SUBSCRIPTION. Therefore, we can
> specify a subset of parameters that can be specified by CREATE
> SUBSCRIPTION. It makes sense to me for 'disable_on_error' since it can
> be specified by CREATE SUBSCRIPTION. Whereas SKIP TRANSACTION stuff
> cannot be done. Are you concerned about adding a syntax to ALTER
> SUBSCRIPTION?
>

Both for additional syntax and consistency with disable_on_error.
Isn't it just a current implementation that Alter only allows to
change parameters supported by Create? Is there a reason why we can't
allow Alter to set/change some parameters not supported by Create?

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Alexey Lesovsky

Date:

July 6, 2021, 13:13:32

On Tue, Jul 6, 2021 at 10:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

> > Also, I'd like to suggest thinking twice about the view name (and function used in view DDL) - "pg_stat_logical_replication_error" contains very common "logical replication" words, but the view contains errors related to subscriptions only. In the future there could be other kinds of errors related to logical replication, but not related to subscriptions - what will you do?

> Is pg_stat_subscription_errors or
> pg_stat_logical_replication_apply_errors better?

It seems to me 'pg_stat_subscription_conflicts' proposed by Amit Kapila is the most suitable, because it directly says about conflicts occurring on the subscription side. The name 'pg_stat_subscription_errors' is also good, especially in case of further extension if some kind of similar errors will be tracked.

> > 3. Add a counter field with total number of errors - it helps to calculate errors rates and aggregations (sum), and don't lose information about errors between view checks.

> Do you mean to increment the error count if the error (command, xid,
> and relid) is the same as the previous one? or to have the total
> number of errors per subscription? And what can we infer from the
> error rates and aggregations?

To be honest, I hurried up when I wrote the first email, and read only about stats view. Later, I read the starting email about the patch and rethought this note.

As I understand, when the conflict occurs, replication stops (until conflict is resolved), an error appears in the stats view. Now, no new errors can occur in the blocked subscription. Hence, there are impossible situations when many errors (like spikes) have occurred and a user didn't see that. If I am correct in my assumption, there is no need for counters. They are necessary only when errors might occur too frequently (like pg_stat_database.deadlocks). But if this is possible, I would prefer the total number of errors per subscription, as also proposed by Amit.

Under "error rates and aggregations" I also mean in the context of when a high number of errors occurred in a short period of time. If a user can read the "total errors" counter and keep this metric in his monitoring system, he will be able to calculate rates over time using functions in the monitoring system. This is extremely useful.

I also would like to clarify, when conflict is resolved - the error record is cleared or kept in the view? If it is cleared, the error counter is required (because we don't want to lose all history of errors). If it is kept - the flag telling about the error is resolved is needed (or set xid to NULL). I mean when the user is watching the view, he should be able to identify if the error has already been resolved or not.

Regards, Alexey

Best regards, Alexey V. Lesovsky

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

July 7, 2021, 09:17:05

On Tue, Jul 6, 2021 at 6:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jul 6, 2021 at 12:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Jul 5, 2021 at 6:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Jul 1, 2021 at 6:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Thu, Jul 1, 2021 at 12:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > >
> > > > > Don't we want to clear stats at drop subscription as well? We do drop
> > > > > database stats in dropdb via pgstat_drop_database, so I think we need
> > > > > to clear subscription stats at the time of drop subscription.
> > > >
> > > > Yes, it needs to be cleared. In the 0003 patch, pgstat_vacuum_stat()
> > > > sends the message to clear the stats. I think it's better to have
> > > > pgstat_vacuum_stat() do that job similar to dropping replication slot
> > > > statistics rather than relying on the single message send at DROP
> > > > SUBSCRIPTION. I've considered doing both: sending the message at DROP
> > > > SUBSCRIPTION and periodical checking by pgstat_vacuum_stat(), but
> > > > dropping subscription not setting a replication slot is able to
> > > > rollback. So we need to send it only at commit time. Given that we
> > > > don’t necessarily need the stats to be updated immediately, I think
> > > > it’s reasonable to go with only a way of pgstat_vacuum_stat().
> > > >
> > >
> > > Okay, that makes sense. Can we consider sending the multiple ids in
> > > one message as we do for relations or functions in
> > > pgstat_vacuum_stat()? That will reduce some message traffic.
> >
> > Yes. Since subscriptions are objects that are not frequently created
> > and dropped I prioritized not to increase the message type. But if we
> > do that for subscriptions, is it better to do that for replication
> > slots as well? It seems to me that the lifetime of subscriptions and
> > replication slots are similar.
> >
>
> Yeah, I think it makes sense to do for both, we can work on slots
> patch separately. I don't see a reason why we shouldn't send a single
> message for multiple clear/drop entries.

+1

>
> > >
> > > True but I guess the user can wait for all the tablesyncs to either
> > > finish or get an error corresponding to the table sync. After that, it
> > > can use 'copy_data' as false. This is not a very good method but I
> > > don't see any other option. I guess whatever is the case logging
> > > errors from tablesyncs is anyway not a bad idea.
> > >
> > > Instead of using the syntax "ALTER SUBSCRIPTION name SET SKIP
> > > TRANSACTION Iconst", isn't it better to use it as a subscription
> > > option like Mark has done for his patch (disable_on_error)?
> >
> > According to the doc, ALTER SUBSCRIPTION ... SET is used to alter
> > parameters originally set by CREATE SUBSCRIPTION. Therefore, we can
> > specify a subset of parameters that can be specified by CREATE
> > SUBSCRIPTION. It makes sense to me for 'disable_on_error' since it can
> > be specified by CREATE SUBSCRIPTION. Whereas SKIP TRANSACTION stuff
> > cannot be done. Are you concerned about adding a syntax to ALTER
> > SUBSCRIPTION?
> >
>
> Both for additional syntax and consistency with disable_on_error.
> Isn't it just a current implementation that Alter only allows to
> change parameters supported by Create? Is there a reason why we can't
> allow Alter to set/change some parameters not supported by Create?

I think there is not reason for that but looking at ALTER TABLE I
thought there is such a policy. I thought the skipping transaction
feature is somewhat different from disable_on_error feature. The
former seems a feature to deal with a problem on the spot whereas the
latter seems a setting of a subscription. Anyway, if we use the
subscription option, we can reset the XID by setting 0? Or do we need
ALTER SUBSCRIPTION RESET?

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

July 8, 2021, 12:28:41

On Wed, Jul 7, 2021 at 11:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Jul 6, 2021 at 6:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > >
> > > According to the doc, ALTER SUBSCRIPTION ... SET is used to alter
> > > parameters originally set by CREATE SUBSCRIPTION. Therefore, we can
> > > specify a subset of parameters that can be specified by CREATE
> > > SUBSCRIPTION. It makes sense to me for 'disable_on_error' since it can
> > > be specified by CREATE SUBSCRIPTION. Whereas SKIP TRANSACTION stuff
> > > cannot be done. Are you concerned about adding a syntax to ALTER
> > > SUBSCRIPTION?
> > >
> >
> > Both for additional syntax and consistency with disable_on_error.
> > Isn't it just a current implementation that Alter only allows to
> > change parameters supported by Create? Is there a reason why we can't
> > allow Alter to set/change some parameters not supported by Create?
>
> I think there is not reason for that but looking at ALTER TABLE I
> thought there is such a policy.
>

If we are looking for precedent then I think we allow to set
configuration parameters via Alter Database but not via Create
Database. Does that address your concern?

> I thought the skipping transaction
> feature is somewhat different from disable_on_error feature. The
> former seems a feature to deal with a problem on the spot whereas the
> latter seems a setting of a subscription. Anyway, if we use the
> subscription option, we can reset the XID by setting 0? Or do we need
> ALTER SUBSCRIPTION RESET?

The other commands like Alter Table, Alter Database, etc, which
provides a way to Set some parameter/option, have a Reset variant. I
think it would be good to have it for Alter Subscription as well but
we might want to allow other parameters to be reset by that as well.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

July 9, 2021, 03:27:35

On Thu, Jul 8, 2021 at 6:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jul 7, 2021 at 11:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Jul 6, 2021 at 6:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > >
> > > > According to the doc, ALTER SUBSCRIPTION ... SET is used to alter
> > > > parameters originally set by CREATE SUBSCRIPTION. Therefore, we can
> > > > specify a subset of parameters that can be specified by CREATE
> > > > SUBSCRIPTION. It makes sense to me for 'disable_on_error' since it can
> > > > be specified by CREATE SUBSCRIPTION. Whereas SKIP TRANSACTION stuff
> > > > cannot be done. Are you concerned about adding a syntax to ALTER
> > > > SUBSCRIPTION?
> > > >
> > >
> > > Both for additional syntax and consistency with disable_on_error.
> > > Isn't it just a current implementation that Alter only allows to
> > > change parameters supported by Create? Is there a reason why we can't
> > > allow Alter to set/change some parameters not supported by Create?
> >
> > I think there is not reason for that but looking at ALTER TABLE I
> > thought there is such a policy.
> >
>
> If we are looking for precedent then I think we allow to set
> configuration parameters via Alter Database but not via Create
> Database. Does that address your concern?

Thank you for the info! But it seems like CREATE DATABASE doesn't
support SET in the first place. Also, interestingly, ALTER SUBSCRIPTION
supports both ENABLE/DISABLE and SET (enabled = on/off). I’m not sure
about consistency with other CREATE and ALTER commands and with
disable_on_error, but it might be better to avoid adding additional
syntax.

>
> > I thought the skipping transaction
> > feature is somewhat different from disable_on_error feature. The
> > former seems a feature to deal with a problem on the spot whereas the
> > latter seems a setting of a subscription. Anyway, if we use the
> > subscription option, we can reset the XID by setting 0? Or do we need
> > ALTER SUBSCRIPTION RESET?
>
> The other commands like Alter Table, Alter Database, etc, which
> provides a way to Set some parameter/option, have a Reset variant. I
> think it would be good to have it for Alter Subscription as well but
> we might want to allow other parameters to be reset by that as well.

Agreed.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

July 9, 2021, 03:42:59

On Tue, Jul 6, 2021 at 7:13 PM Alexey Lesovsky <lesovsky@gmail.com> wrote:
>
> On Tue, Jul 6, 2021 at 10:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>> > Also, I'd like to suggest thinking twice about the view name (and function used in view DDL) - "pg_stat_logical_replication_error" contains very common "logical replication" words, but the view contains errors related to subscriptions only. In the future there could be other kinds of errors related to logical replication, but not related to subscriptions - what will you do?
>>
>>
>> Is pg_stat_subscription_errors or
>> pg_stat_logical_replication_apply_errors better?
>
>
> It seems to me 'pg_stat_subscription_conflicts' proposed by Amit Kapila is the most suitable, because it directly says about conflicts occurring on the subscription side. The name 'pg_stat_subscription_errors' is also good, especially in case of further extension if some kind of similar errors will be tracked.

I personally prefer pg_stat_subscription_errors since
pg_stat_subscription_conflicts could be used for conflict resolution
features in the future. This stats view I'm proposing is meant to
focus on errors that happened during applying logical changes. So
using the term 'errors' seems to make sense to me.

>
>>
>> > 3. Add a counter field with total number of errors - it helps to calculate errors rates and aggregations (sum), and don't lose information about errors between view checks.
>>
>> Do you mean to increment the error count if the error (command, xid,
>> and relid) is the same as the previous one? or to have the total
>> number of errors per subscription? And what can we infer from the
>> error rates and aggregations?
>
>
> To be honest, I hurried up when I wrote the first email, and read only about stats view. Later, I read the starting email about the patch and rethought this note.
>
> As I understand, when the conflict occurs, replication stops (until conflict is resolved), an error appears in the stats view. Now, no new errors can occur in the blocked subscription. Hence, there are impossible situations when many errors (like spikes) have occurred and a user didn't see that. If I am correct in my assumption, there is no need for counters. They are necessary only when errors might occur too frequently (like pg_stat_database.deadlocks). But if this is possible, I would prefer the total number of errors per subscription, as also proposed by Amit.

Yeah, the total number of errors seems better.

>
> Under "error rates and aggregations" I also mean in the context of when a high number of errors occurred in a short period of time. If a user can read the "total errors" counter and keep this metric in his monitoring system, he will be able to calculate rates over time using functions in the monitoring system. This is extremely useful.

Thanks for your explanation. Agreed. But the rate depends on
wal_retrieve_retry_interval so is not likely to be high in practice.
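
To put numbers on this, here is a minimal Python sketch of how a monitoring system might turn two samples of a cumulative error counter into a rate, and of the ceiling implied by the retry interval. It assumes a "total errors" column exists (it is only proposed in this thread) and that a failing apply worker is retried at most once per wal_retrieve_retry_interval; all function names and sample values are made up for this example.

```python
def error_rate(prev_count, prev_time, curr_count, curr_time):
    """Errors per second between two samples of a cumulative counter."""
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing times")
    # A counter reset (e.g. stats cleared) makes the delta negative;
    # treat the current value as the number of errors since the reset.
    delta = curr_count - prev_count
    if delta < 0:
        delta = curr_count
    return delta / elapsed


def max_retry_rate(retry_interval_seconds):
    """Upper bound on errors per second if the worker retries once per interval."""
    return 1.0 / retry_interval_seconds


# Two hypothetical samples taken 60 seconds apart.
rate = error_rate(prev_count=10, prev_time=0.0, curr_count=22, curr_time=60.0)
print(rate)
# Ceiling for a 5 second retry interval (the default wal_retrieve_retry_interval).
print(max_retry_rate(5.0))
```

Under these assumptions a single blocked subscription cannot report more than one error per retry interval, which is why the rate stays low.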

> I also would like to clarify, when conflict is resolved - the error record is cleared or kept in the view? If it is cleared, the error counter is required (because we don't want to lose all history of errors). If it is kept - the flag telling about the error is resolved is needed (or set xid to NULL). I mean when the user is watching the view, he should be able to identify if the error has already been resolved or not.

With the current patch, once the conflict is resolved by skipping the
transaction in question, its entry on the stats view is cleared. As
you suggested, if we have the total error counts in that view, it
would be good to keep the count and clear other fields such as xid,
last_failure, and command etc.
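
A toy model of that behaviour, in Python, might look like the following. It is only a sketch of the idea of keeping the counter while clearing the last-error details; the class, its field names, and its methods are hypothetical and do not reflect the actual stats collector data structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SubscriptionErrorEntry:
    """Hypothetical per-subscription stats entry; field names follow the thread."""
    subname: str
    error_count: int = 0
    command: Optional[str] = None
    relid: Optional[int] = None
    xid: Optional[int] = None
    last_failure: Optional[str] = None

    def report_error(self, command, relid, xid, when):
        # Every reported failure bumps the cumulative counter.
        self.error_count += 1
        self.command, self.relid, self.xid = command, relid, xid
        self.last_failure = when

    def mark_resolved(self):
        # Keep the cumulative counter, clear the details of the last error.
        self.command = self.relid = self.xid = self.last_failure = None

    @property
    def has_unresolved_error(self):
        return self.xid is not None


entry = SubscriptionErrorEntry("test_sub")
entry.report_error("INSERT", 16384, 736, "2021-06-27 12:12:45.142675+09")
entry.mark_resolved()
print(entry.error_count, entry.has_unresolved_error)
```

A NULL xid then doubles as the "already resolved" indicator Alexey asked about, while the counter preserves the history.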

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Alexey Lesovsky

Date:

July 9, 2021, 06:32:19

On Fri, Jul 9, 2021 at 5:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Jul 6, 2021 at 7:13 PM Alexey Lesovsky <lesovsky@gmail.com> wrote:
>
> On Tue, Jul 6, 2021 at 10:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>> > Also, I'd like to suggest thinking twice about the view name (and function used in view DDL) - "pg_stat_logical_replication_error" contains very common "logical replication" words, but the view contains errors related to subscriptions only. In the future there could be other kinds of errors related to logical replication, but not related to subscriptions - what will you do?
>>
>>
>> Is pg_stat_subscription_errors or
>> pg_stat_logical_replication_apply_errors better?
>
>
> It seems to me 'pg_stat_subscription_conflicts' proposed by Amit Kapila is the most suitable, because it directly says about conflicts occurring on the subscription side. The name 'pg_stat_subscription_errors' is also good, especially in case of further extension if some kind of similar errors will be tracked.

I personally prefer pg_stat_subscription_errors since
pg_stat_subscription_conflicts could be used for conflict resolution
features in the future. This stats view I'm proposing is meant to
focus on errors that happened during applying logical changes. So
using the term 'errors' seems to make sense to me.

Agreed

>
>>
>> > 3. Add a counter field with total number of errors - it helps to calculate errors rates and aggregations (sum), and don't lose information about errors between view checks.
>>
>> Do you mean to increment the error count if the error (command, xid,
>> and relid) is the same as the previous one? or to have the total
>> number of errors per subscription? And what can we infer from the
>> error rates and aggregations?
>
>
> To be honest, I hurried up when I wrote the first email, and read only about stats view. Later, I read the starting email about the patch and rethought this note.
>
> As I understand, when the conflict occurs, replication stops (until the conflict is resolved) and an error appears in the stats view. From then on, no new errors can occur in the blocked subscription. Hence, a situation where many errors (like spikes) occur without the user seeing them is impossible. If I am correct in my assumption, there is no need for counters. They are necessary only when errors might occur too frequently (like pg_stat_database.deadlocks). But if this is possible, I would prefer the total number of errors per subscription, as also proposed by Amit.

Yeah, the total number of errors seems better.

Agreed

>
> By "error rates and aggregations" I also mean the case when a high number of errors occurred in a short period of time. If a user can read the "total errors" counter and keep this metric in his monitoring system, he will be able to calculate rates over time using functions in the monitoring system. This is extremely useful.

Thanks for your explanation. Agreed. But the rate depends on
wal_retrieve_retry_interval so is not likely to be high in practice.

Agreed
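
A minimal sketch of the rate calculation discussed above, as a monitoring system might do it from two readings of a cumulative counter. The struct, the function names, and the "total errors" column are hypothetical and not part of the patch. The upper bound assumes the worker fails at most once per retry interval, which is a simplification.

```c
#include <assert.h>
#include <stdint.h>

/* One reading of a cumulative error counter (hypothetical column). */
typedef struct CounterSample
{
    uint64_t total_errors;  /* value of the "total errors" counter */
    double   taken_at_sec;  /* time of the reading, in seconds */
} CounterSample;

/*
 * Errors per second between two readings. Returns 0 if the interval is
 * empty or the counter went backwards (e.g. after a stats reset).
 */
static double
error_rate_per_sec(CounterSample earlier, CounterSample later)
{
    double elapsed = later.taken_at_sec - earlier.taken_at_sec;

    if (elapsed <= 0.0 || later.total_errors < earlier.total_errors)
        return 0.0;
    return (double) (later.total_errors - earlier.total_errors) / elapsed;
}

/*
 * Rough upper bound on the rate, assuming one failure per retry
 * interval (the role played by wal_retrieve_retry_interval).
 */
static double
max_error_rate_per_sec(double retry_interval_sec)
{
    assert(retry_interval_sec > 0.0);
    return 1.0 / retry_interval_sec;
}
```

With a retry interval of a few seconds the bound stays well below one error per second, which matches the remark that the rate is unlikely to be high in practice.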

> I also would like to clarify, when conflict is resolved - the error record is cleared or kept in the view? If it is cleared, the error counter is required (because we don't want to lose all history of errors). If it is kept - the flag telling about the error is resolved is needed (or set xid to NULL). I mean when the user is watching the view, he should be able to identify if the error has already been resolved or not.

With the current patch, once the conflict is resolved by skipping the
transaction in question, its entry on the stats view is cleared. As
you suggested, if we have the total error counts in that view, it
would be good to keep the count and clear other fields such as xid,
last_failure, and command etc.
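
A minimal sketch of "keep the count, clear the other fields". The struct layout and function names are hypothetical and only illustrate the idea; they are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-subscription error entry. */
typedef struct SubErrorEntry
{
    uint32_t subid;          /* subscription OID */
    uint32_t relid;          /* relation OID, 0 if none */
    uint32_t xid;            /* remote XID, 0 if no error is pending */
    int64_t  last_failure;   /* timestamp of the last failure, 0 if none */
    char     command[16];    /* e.g. "INSERT" */
    uint64_t failure_count;  /* cumulative, survives a clear */
} SubErrorEntry;

/* Record a failure: overwrite the details and bump the counter. */
static void
record_error(SubErrorEntry *e, uint32_t relid, uint32_t xid,
             int64_t now, const char *command)
{
    assert(e != NULL && command != NULL);
    e->relid = relid;
    e->xid = xid;
    e->last_failure = now;
    strncpy(e->command, command, sizeof(e->command) - 1);
    e->command[sizeof(e->command) - 1] = '\0';
    e->failure_count++;
}

/* After the transaction is skipped: drop details, keep the counter. */
static void
clear_error_details(SubErrorEntry *e)
{
    assert(e != NULL);
    e->relid = 0;
    e->xid = 0;
    e->last_failure = 0;
    e->command[0] = '\0';
}
```

A reader of the view could then treat a zero xid as "no unresolved error" while still seeing how often the subscription has failed.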

Ok, looks nice. But I am curious how this will work in the case when there are two (or more) errors in the same subscription, but different relations? After resolution, will all these records be kept, or will they be merged into a single record (because the subscription was the same for all errors)?

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

Regards, Alexey Lesovsky

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

12 July 2021, 06:31:23

On Fri, Jul 9, 2021 at 5:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Jul 8, 2021 at 6:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Jul 7, 2021 at 11:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Tue, Jul 6, 2021 at 6:33 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > >
> > > > > According to the doc, ALTER SUBSCRIPTION ... SET is used to alter
> > > > > parameters originally set by CREATE SUBSCRIPTION. Therefore, we can
> > > > > specify a subset of parameters that can be specified by CREATE
> > > > > SUBSCRIPTION. It makes sense to me for 'disable_on_error' since it can
> > > > > be specified by CREATE SUBSCRIPTION. Whereas SKIP TRANSACTION stuff
> > > > > cannot be done. Are you concerned about adding a syntax to ALTER
> > > > > SUBSCRIPTION?
> > > > >
> > > >
> > > > Both for additional syntax and consistency with disable_on_error.
> > > > Isn't it just a current implementation that Alter only allows to
> > > > change parameters supported by Create? Is there a reason why we can't
> > > > allow Alter to set/change some parameters not supported by Create?
> > >
> > > I think there is no reason for that but looking at ALTER TABLE I
> > > thought there is such a policy.
> > >
> >
> > If we are looking for precedent then I think we allow to set
> > configuration parameters via Alter Database but not via Create
> > Database. Does that address your concern?
>
> Thank you for the info! But it seems like CREATE DATABASE doesn't
> support SET in the first place. Also interestingly, ALTER SUBSCRIPTION
> support both ENABLE/DISABLE and SET (enabled = on/off).
>

I think that is redundant but not sure if there is any reason behind doing so.

> I’m not sure
> from the point of view of consistency with other CREATE, ALTER
> commands, and disable_on_error but it might be better to avoid adding
> additional syntax.
>

If we can avoid introducing new syntax that in itself is a good reason
to introduce it as an option.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

12 July 2021, 06:36:30

On Fri, Jul 9, 2021 at 9:02 AM Alexey Lesovsky <lesovsky@gmail.com> wrote:
>
> On Fri, Jul 9, 2021 at 5:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>> > I also would like to clarify, when conflict is resolved - the error record is cleared or kept in the view? If it is cleared, the error counter is required (because we don't want to lose all history of errors). If it is kept - the flag telling about the error is resolved is needed (or set xid to NULL). I mean when the user is watching the view, he should be able to identify if the error has already been resolved or not.
>>
>> With the current patch, once the conflict is resolved by skipping the
>> transaction in question, its entry on the stats view is cleared. As
>> you suggested, if we have the total error counts in that view, it
>> would be good to keep the count and clear other fields such as xid,
>> last_failure, and command etc.
>
>
> Ok, looks nice. But I am curious how this will work in the case when there are two (or more) errors in the same subscription, but different relations?
>

We can't proceed unless the first error is resolved, so there
shouldn't be multiple unresolved errors. However, there is an
exception to it which is during initial table sync and I think the
view should have separate rows for each table sync.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Alexey Lesovsky

Date:

12 July 2021, 07:07:18

On Mon, Jul 12, 2021 at 8:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

>
> Ok, looks nice. But I am curious how this will work in the case when there are two (or more) errors in the same subscription, but different relations?
>

We can't proceed unless the first error is resolved, so there
shouldn't be multiple unresolved errors.

Ok. I thought multiple errors are possible when many tables are initialized using parallel workers (with max_sync_workers_per_subscription > 1).

Regards, Alexey

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

12 July 2021, 07:15:30

On Mon, Jul 12, 2021 at 9:37 AM Alexey Lesovsky <lesovsky@gmail.com> wrote:
>
> On Mon, Jul 12, 2021 at 8:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> >
>> > Ok, looks nice. But I am curious how this will work in the case when there are two (or more) errors in the same subscription, but different relations?
>> >
>>
>> We can't proceed unless the first error is resolved, so there
>> shouldn't be multiple unresolved errors.
>
>
> Ok. I thought multiple errors are possible when many tables are initialized using parallel workers (with max_sync_workers_per_subscription > 1).
>

Yeah, that is possible but that is covered under the second condition
mentioned by me and in such cases I think we should have separate rows
for each tablesync. Is that right, Sawada-san or do you have something
else in mind?

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

12 July 2021, 08:42:42

On Mon, Jul 12, 2021 at 1:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Jul 12, 2021 at 9:37 AM Alexey Lesovsky <lesovsky@gmail.com> wrote:
> >
> > On Mon, Jul 12, 2021 at 8:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >>
> >> >
> >> > Ok, looks nice. But I am curious how this will work in the case when there are two (or more) errors in the same subscription, but different relations?
> >> >
> >>
> >> We can't proceed unless the first error is resolved, so there
> >> shouldn't be multiple unresolved errors.
> >
> >
> > > Ok. I thought multiple errors are possible when many tables are initialized using parallel workers (with max_sync_workers_per_subscription > 1).
> >
>
> Yeah, that is possible but that covers under the second condition
> mentioned by me and in such cases I think we should have separate rows
> for each tablesync. Is that right, Sawada-san or do you have something
> else in mind?

Yeah, I agree to have separate rows for each table sync. The table
should not be processed by both the table sync worker and the apply
worker at a time so the pair of subscription OID and relation OID will
be unique. I think that we can have a boolean column in the view,
indicating whether the error entry is reported by the table sync
worker or the apply worker, or maybe we also can have the action
column show "TABLE SYNC" if the error is reported by the table sync
worker.
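
A minimal sketch of why the (subscription OID, relation OID) pair can serve as the key for error entries, so that each table sync gets its own row. The types, the fixed-size table, and the function names are hypothetical; the patch keeps its entries in a hash table instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key: one entry per (subscription, relation) pair. */
typedef struct SubErrKey
{
    uint32_t subid;
    uint32_t relid;
} SubErrKey;

static bool
key_equal(SubErrKey a, SubErrKey b)
{
    return a.subid == b.subid && a.relid == b.relid;
}

#define MAX_ENTRIES 8

/* Tiny stand-in for the hash table, using a linear scan. */
typedef struct SubErrTable
{
    SubErrKey keys[MAX_ENTRIES];
    uint64_t  counts[MAX_ENTRIES];
    size_t    used;
} SubErrTable;

/* Return the index of the entry for key, creating it if needed; -1 if full. */
static int
find_or_add(SubErrTable *t, SubErrKey key)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->used; i++)
    {
        if (key_equal(t->keys[i], key))
            return (int) i;
    }
    if (t->used == MAX_ENTRIES)
        return -1;
    t->keys[t->used] = key;
    t->counts[t->used] = 0;
    return (int) t->used++;
}
```

Two table sync workers of the same subscription failing on different relations map to different keys, so neither overwrites the other's entry.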

When it comes to removing the subscription errors in
pgstat_vacuum_stat(), I think we need to seq scan on the hash table
and send the messages to purge the subscription error entries.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

12 July 2021, 14:51:53

On Mon, Jul 12, 2021 at 11:13 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Jul 12, 2021 at 1:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Jul 12, 2021 at 9:37 AM Alexey Lesovsky <lesovsky@gmail.com> wrote:
> > >
> > > On Mon, Jul 12, 2021 at 8:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >>
> > >> >
> > >> > Ok, looks nice. But I am curious how this will work in the case when there are two (or more) errors in the same subscription, but different relations?
> > >> >
> > >>
> > >> We can't proceed unless the first error is resolved, so there
> > >> shouldn't be multiple unresolved errors.
> > >
> > >
> > > > Ok. I thought multiple errors are possible when many tables are initialized using parallel workers (with max_sync_workers_per_subscription > 1).
> > >
> >
> > Yeah, that is possible but that covers under the second condition
> > mentioned by me and in such cases I think we should have separate rows
> > for each tablesync. Is that right, Sawada-san or do you have something
> > else in mind?
>
> Yeah, I agree to have separate rows for each table sync. The table
> should not be processed by both the table sync worker and the apply
> worker at a time so the pair of subscription OID and relation OID will
> be unique. I think that we have a boolean column in the view,
> indicating whether the error entry is reported by the table sync
> worker or the apply worker, or maybe we also can have the action
> column show "TABLE SYNC" if the error is reported by the table sync
> worker.
>

Or similar to backend_type (text) in pg_stat_activity, we can have
something like error_source (text) which will display apply worker or
tablesync worker? I think if we have this column then even if there is
a chance that both the apply and sync workers operate on the same
relation, we can identify it via this column.
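
A minimal sketch of an error_source value rendered as text, in the spirit of backend_type in pg_stat_activity. The enum, the function names, and the exact strings are assumptions for illustration; the thread only suggests that the column would display "apply worker" or "tablesync worker".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical origin of a reported error. */
typedef enum ErrorSource
{
    ERROR_SOURCE_APPLY,
    ERROR_SOURCE_TABLESYNC
} ErrorSource;

/* Text shown in the hypothetical error_source column. */
static const char *
error_source_name(ErrorSource src)
{
    switch (src)
    {
        case ERROR_SOURCE_APPLY:
            return "apply worker";
        case ERROR_SOURCE_TABLESYNC:
            return "tablesync worker";
    }
    return "unknown";
}

/* Reverse mapping; returns false if the name is not recognized. */
static bool
error_source_from_name(const char *name, ErrorSource *out)
{
    assert(name != NULL && out != NULL);
    for (int s = ERROR_SOURCE_APPLY; s <= ERROR_SOURCE_TABLESYNC; s++)
    {
        if (strcmp(name, error_source_name((ErrorSource) s)) == 0)
        {
            *out = (ErrorSource) s;
            return true;
        }
    }
    return false;
}
```

A text column leaves room for further sources later without changing the column type, which a boolean flag would not.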

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

14 July 2021, 11:14:32

On Mon, Jul 12, 2021 at 8:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Jul 12, 2021 at 11:13 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Jul 12, 2021 at 1:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Jul 12, 2021 at 9:37 AM Alexey Lesovsky <lesovsky@gmail.com> wrote:
> > > >
> > > > On Mon, Jul 12, 2021 at 8:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >>
> > > >> >
> > > >> > Ok, looks nice. But I am curious how this will work in the case when there are two (or more) errors in the same subscription, but different relations?
> > > >> >
> > > >>
> > > >> We can't proceed unless the first error is resolved, so there
> > > >> shouldn't be multiple unresolved errors.
> > > >
> > > >
> > > > > Ok. I thought multiple errors are possible when many tables are initialized using parallel workers (with max_sync_workers_per_subscription > 1).
> > > >
> > >
> > > Yeah, that is possible but that covers under the second condition
> > > mentioned by me and in such cases I think we should have separate rows
> > > for each tablesync. Is that right, Sawada-san or do you have something
> > > else in mind?
> >
> > Yeah, I agree to have separate rows for each table sync. The table
> > should not be processed by both the table sync worker and the apply
> > worker at a time so the pair of subscription OID and relation OID will
> > be unique. I think that we have a boolean column in the view,
> > indicating whether the error entry is reported by the table sync
> > worker or the apply worker, or maybe we also can have the action
> > column show "TABLE SYNC" if the error is reported by the table sync
> > worker.
> >
>
> Or similar to backend_type (text) in pg_stat_activity, we can have
> something like error_source (text) which will display apply worker or
> tablesync worker? I think if we have this column then even if there is
> a chance that both apply and sync worker operates on the same
> relation, we can identify it via this column.

Sounds good. I'll incorporate this in the next version patch that I'm
planning to submit this week.


Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

16 July 2021, 18:02:58

On Wed, Jul 14, 2021 at 5:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Jul 12, 2021 at 8:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Jul 12, 2021 at 11:13 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Mon, Jul 12, 2021 at 1:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Mon, Jul 12, 2021 at 9:37 AM Alexey Lesovsky <lesovsky@gmail.com> wrote:
> > > > >
> > > > > On Mon, Jul 12, 2021 at 8:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >>
> > > > >> >
> > > > >> > Ok, looks nice. But I am curious how this will work in the case when there are two (or more) errors in the same subscription, but different relations?
> > > > >> >
> > > > >>
> > > > >> We can't proceed unless the first error is resolved, so there
> > > > >> shouldn't be multiple unresolved errors.
> > > > >
> > > > >
> > > > > > Ok. I thought multiple errors are possible when many tables are initialized using parallel workers (with max_sync_workers_per_subscription > 1).
> > > > >
> > > >
> > > > Yeah, that is possible but that covers under the second condition
> > > > mentioned by me and in such cases I think we should have separate rows
> > > > for each tablesync. Is that right, Sawada-san or do you have something
> > > > else in mind?
> > >
> > > Yeah, I agree to have separate rows for each table sync. The table
> > > should not be processed by both the table sync worker and the apply
> > > worker at a time so the pair of subscription OID and relation OID will
> > > be unique. I think that we have a boolean column in the view,
> > > indicating whether the error entry is reported by the table sync
> > > worker or the apply worker, or maybe we also can have the action
> > > column show "TABLE SYNC" if the error is reported by the table sync
> > > worker.
> > >
> >
> > Or similar to backend_type (text) in pg_stat_activity, we can have
> > something like error_source (text) which will display apply worker or
> > tablesync worker? I think if we have this column then even if there is
> > a chance that both apply and sync worker operates on the same
> > relation, we can identify it via this column.
>
> Sounds good. I'll incorporate this in the next version patch that I'm
> planning to submit this week.

Sorry, I could not make it this week. I'll submit them early next week.
While updating the patch I thought we need to have more design
discussion on two points of clearing error details after the error is
resolved:

1. How to clear apply worker errors. IIUC we've discussed that once
the apply worker skipped the transaction we leave the error entry
itself but clear its fields except for some fields such as failure
counts. But given that the stats messages could be lost, how can we
ensure that those error details are cleared? For table sync workers' errors, we
can have autovacuum workers periodically check entries of
pg_subscription_rel and clear the error entry if the table sync worker
completes table sync (e.g., checking if srsubstate = 'r'). But there
is no such information for the apply workers and subscriptions. In
addition to sending the message clearing the error details just after
skipping the transaction, I thought that we can have apply workers
periodically send the message clearing the error details but it seems
not good.

2. Do we really want to leave the table sync worker's error entry even after the
error is resolved and the table sync completes? Unlike the apply
worker error, the number of table sync worker errors could be very
large, for example, if a subscriber subscribes to many tables. If we
leave those errors in the stats view, it uses more memory space and
could affect writing and reading stats file performance. If such left
table sync error entries are not helpful in practice I think we can
remove them rather than clear some fields. What do you think?
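
A minimal sketch of the table sync cleanup described in point 1 and the removal proposed in point 2: entries whose relation has reached the ready state are dropped rather than partially cleared. The entry struct and purge_completed() are hypothetical; 'r' is the srsubstate value for "ready" in pg_subscription_rel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* srsubstate value meaning the table is synchronized and ready. */
#define SUBREL_STATE_READY 'r'

/* Hypothetical table sync error entry paired with its catalog state. */
typedef struct TableSyncError
{
    uint32_t relid;       /* relation OID */
    char     srsubstate;  /* state read from pg_subscription_rel */
} TableSyncError;

/*
 * Remove entries whose table sync has completed, compacting the array
 * in place. Returns the number of entries kept.
 */
static size_t
purge_completed(TableSyncError *entries, size_t n)
{
    size_t kept = 0;

    assert(entries != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        if (entries[i].srsubstate != SUBREL_STATE_READY)
            entries[kept++] = entries[i];
    }
    return kept;
}
```

Removing completed entries keeps the number of rows bounded by the number of tables still syncing, which addresses the memory and stats file concern for subscriptions with many tables.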

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

19 July 2021, 08:22:35

On Fri, Jul 16, 2021 at 8:33 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jul 14, 2021 at 5:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Sounds good. I'll incorporate this in the next version patch that I'm
> > planning to submit this week.
>
> Sorry, I could not make it this week. I'll submit them early next week.
>

No problem.

> While updating the patch I thought we need to have more design
> discussion on two points of clearing error details after the error is
> resolved:
>
> 1. How to clear apply worker errors. IIUC we've discussed that once
> the apply worker skipped the transaction we leave the error entry
> itself but clear its fields except for some fields such as failure
> counts. But given that the stats messages could be lost, how can we
> ensure that those error details are cleared? For table sync workers' errors, we
> can have autovacuum workers periodically check entries of
> pg_subscription_rel and clear the error entry if the table sync worker
> completes table sync (e.g., checking if srsubstate = 'r'). But there
> is no such information for the apply workers and subscriptions.
>

But won't the corresponding subscription (pg_subscription) have the
XID as InvalidTransactionId once the xid is skipped or at least a
different XID than we would have in pg_stat view? Can we use that to
reset entry via vacuum?

> In
> addition to sending the message clearing the error details just after
> skipping the transaction, I thought that we can have apply workers
> periodically send the message clearing the error details but it seems
> not good.
>

Yeah, such things should be a last resort.

> 2. Do we really want to leave the table sync worker even after the
> error is resolved and the table sync completes? Unlike the apply
> worker error, the number of table sync worker errors could be very
> large, for example, if a subscriber subscribes to many tables. If we
> leave those errors in the stats view, it uses more memory space and
> could affect writing and reading stats file performance. If such left
> table sync error entries are not helpful in practice I think we can
> remove them rather than clear some fields. What do you think?
>

Sounds reasonable to me. One might think to update the subscription
error count by including table_sync errors but not sure if that is
helpful and even if that is helpful, we can extend it later.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

19 July 2021, 09:35:47

On Mon, Jul 19, 2021 at 2:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jul 16, 2021 at 8:33 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Jul 14, 2021 at 5:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > Sounds good. I'll incorporate this in the next version patch that I'm
> > > planning to submit this week.
> >
> > Sorry, I could not make it this week. I'll submit them early next week.
> >
>
> No problem.
>
> > While updating the patch I thought we need to have more design
> > discussion on two points of clearing error details after the error is
> > resolved:
> >
> > 1. How to clear apply worker errors. IIUC we've discussed that once
> > the apply worker skipped the transaction we leave the error entry
> > itself but clear its fields except for some fields such as failure
> > counts. But given that the stats messages could be lost, how can we
> > ensure that those error details are cleared? For table sync workers' errors, we
> > can have autovacuum workers periodically check entries of
> > pg_subscription_rel and clear the error entry if the table sync worker
> > completes table sync (e.g., checking if srsubstate = 'r'). But there
> > is no such information for the apply workers and subscriptions.
> >
>
> But won't the corresponding subscription (pg_subscription) have the
> XID as InvalidTransactionId once the xid is skipped or at least a
> different XID than we would have in pg_stat view? Can we use that to
> reset entry via vacuum?

I think the XID is InvalidTransactionId until the user specifies it. So
I think we cannot know whether we're before skipping or after skipping
only by the transaction ID. No?

>
> > In
> > addition to sending the message clearing the error details just after
> > skipping the transaction, I thought that we can have apply workers
> > periodically send the message clearing the error details but it seems
> > not good.
> >
>
> Yeah, such things should be a last resort.
>
> > 2. Do we really want to leave the table sync worker even after the
> > error is resolved and the table sync completes? Unlike the apply
> > worker error, the number of table sync worker errors could be very
> > large, for example, if a subscriber subscribes to many tables. If we
> > leave those errors in the stats view, it uses more memory space and
> > could affect writing and reading stats file performance. If such left
> > table sync error entries are not helpful in practice I think we can
> > remove them rather than clear some fields. What do you think?
> >
>
> Sounds reasonable to me. One might think to update the subscription
> error count by including table_sync errors but not sure if that is
> helpful and even if that is helpful, we can extend it later.

Agreed.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

19 July 2021, 09:39:30

On Sat, Jul 17, 2021 at 12:02 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jul 14, 2021 at 5:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Jul 12, 2021 at 8:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Jul 12, 2021 at 11:13 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Mon, Jul 12, 2021 at 1:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Mon, Jul 12, 2021 at 9:37 AM Alexey Lesovsky <lesovsky@gmail.com> wrote:
> > > > > >
> > > > > > On Mon, Jul 12, 2021 at 8:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > >>
> > > > > >> >
> > > > > >> > Ok, looks nice. But I am curious how this will work in the case when there are two (or more) errors in the same subscription, but different relations?
> > > > > >> >
> > > > > >>
> > > > > >> We can't proceed unless the first error is resolved, so there
> > > > > >> shouldn't be multiple unresolved errors.
> > > > > >
> > > > > >
> > > > > > > Ok. I thought multiple errors are possible when many tables are initialized using parallel workers (with max_sync_workers_per_subscription > 1).
> > > > > >
> > > > >
> > > > > Yeah, that is possible but that covers under the second condition
> > > > > mentioned by me and in such cases I think we should have separate rows
> > > > > for each tablesync. Is that right, Sawada-san or do you have something
> > > > > else in mind?
> > > >
> > > > Yeah, I agree to have separate rows for each table sync. The table
> > > > should not be processed by both the table sync worker and the apply
> > > > worker at a time so the pair of subscription OID and relation OID will
> > > > be unique. I think that we have a boolean column in the view,
> > > > indicating whether the error entry is reported by the table sync
> > > > worker or the apply worker, or maybe we also can have the action
> > > > column show "TABLE SYNC" if the error is reported by the table sync
> > > > worker.
> > > >
> > >
> > > Or similar to backend_type (text) in pg_stat_activity, we can have
> > > something like error_source (text) which will display apply worker or
> > > tablesync worker? I think if we have this column then even if there is
> > > a chance that both apply and sync worker operates on the same
> > > relation, we can identify it via this column.
> >
> > Sounds good. I'll incorporate this in the next version patch that I'm
> > planning to submit this week.
>
> Sorry, I could not make it this week. I'll submit them early next week.
> While updating the patch I thought we need to have more design
> discussion on two points of clearing error details after the error is
> resolved:
>
> 1. How to clear apply worker errors. IIUC we've discussed that once
> the apply worker skipped the transaction we leave the error entry
> itself but clear its fields except for some fields such as failure
> counts. But given that the stats messages could be lost, how can we
> ensure that those error details are cleared? For table sync workers' errors, we
> can have autovacuum workers periodically check entries of
> pg_subscription_rel and clear the error entry if the table sync worker
> completes table sync (e.g., checking if srsubstate = 'r'). But there
> is no such information for the apply workers and subscriptions. In
> addition to sending the message clearing the error details just after
> skipping the transaction, I thought that we can have apply workers
> periodically send the message clearing the error details but it seems
> not good.

I think that the motivation behind the idea of leaving error entries
and clearing some of their fields is that users can check if the error
is successfully resolved and the worker is working fine. But we can
check it also in another way, for example, checking
pg_stat_subscription view. So is it worth considering leaving the
apply worker errors as they are?

>
> 2. Do we really want to leave the table sync worker even after the
> error is resolved and the table sync completes? Unlike the apply
> worker error, the number of table sync worker errors could be very
> large, for example, if a subscriber subscribes to many tables. If we
> leave those errors in the stats view, it uses more memory space and
> could affect writing and reading stats file performance. If such left
> table sync error entries are not helpful in practice I think we can
> remove them rather than clear some fields. What do you think?
>

I've attached the updated version patch that incorporated all comments
I got so far except for the clearing error details part I mentioned
above. After getting a consensus on those parts, I'll incorporate the
idea into the patches.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

On Thu, Jul 22, 2021 at 8:53 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On July 20, 2021 9:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > On Mon, Jul 19, 2021 at 8:38 PM houzj.fnst@fujitsu.com
> > <houzj.fnst@fujitsu.com> wrote:
> > >
> > > On July 19, 2021 2:40 PM Masahiko Sawada <sawada.mshk@gmail.com>
> > wrote:
> > > > I've attached the updated version patch that incorporated all
> > > > comments I got so far except for the clearing error details part I
> > > > mentioned above. After getting a consensus on those parts, I'll
> > > > incorporate the idea into the patches.
> > >
> > > 3) For 0003 patch, if user set skip_xid to a wrong xid which have not been
> > >    assigned, and then will the change be skipped when the xid is assigned in
> > >    the future even if it doesn't cause any conflicts ?
> >
> > Yes. Currently, setting a correct xid is the user's responsibility. I think it would
> > be better to disable it or emit WARNING/ERROR when the user mistakenly set
> > the wrong xid if we find out a convenient way to detect that.
>
> Thanks for the explanation. As Amit suggested, it seems we can document the
> risk of misusing skip_xid. Besides, I found some minor things in the patch.
>
> 1) In 0002 patch
>
> + */
> +static void
> +pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
> +{
> +       if (subscriptionErrHash != NULL)
> +               return;
> +
>
> +static void
> +pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
> +{
>
> The second parameter "len" seems to be unused in the functions
> pgstat_recv_subscription_purge() and pgstat_recv_subscription_error().
>

'len' is unused not only in the functions the patch adds but also in
the other pgstat_recv_* functions. Can we remove all of them in a
separate patch? 'len' in the pgstat_recv_* functions has never been used
since the stats collector code was introduced. It seems that it was
mistakenly introduced in the first commit, and the pgstat_recv_*
functions added later followed suit by declaring 'len' without using
it either.

>
> 2) in 0003 patch
>
>   * Helper function for apply_handle_commit and apply_handle_stream_commit.
>   */
>  static void
> -apply_handle_commit_internal(StringInfo s, LogicalRepCommitData *commit_data)
> +apply_handle_commit_internal(LogicalRepCommitData *commit_data)
>  {
>
> This looks like a separate change which removes an unused parameter in existing
> code, maybe we can get this committed first?

Yeah, it seems to be introduced by commit 0926e96c493. I've attached
the patch for that.

Also, I've attached the updated version patches. This version adds
the pg_stat_reset_subscription_error() SQL function and sends a clear
message after skipping the transaction. The 0004 patch includes both the
transaction-skipping feature and the introduction of RESET to ALTER
SUBSCRIPTION. It would be better to separate them.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

29 July 2021, 08:04:50

On Mon, Jul 26, 2021 at 11:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Jul 22, 2021 at 8:53 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On July 20, 2021 9:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > On Mon, Jul 19, 2021 at 8:38 PM houzj.fnst@fujitsu.com
> > > <houzj.fnst@fujitsu.com> wrote:
> > > >
> > > > On July 19, 2021 2:40 PM Masahiko Sawada <sawada.mshk@gmail.com>
> > > wrote:
> > > > > I've attached the updated version patch that incorporated all
> > > > > comments I got so far except for the clearing error details part I
> > > > > mentioned above. After getting a consensus on those parts, I'll
> > > > > incorporate the idea into the patches.
> > > >
> > > > 3) For 0003 patch, if user set skip_xid to a wrong xid which have not been
> > > >    assigned, and then will the change be skipped when the xid is assigned in
> > > >    the future even if it doesn't cause any conflicts ?
> > >
> > > Yes. Currently, setting a correct xid is the user's responsibility. I think it would
> > > be better to disable it or emit WARNING/ERROR when the user mistakenly set
> > > the wrong xid if we find out a convenient way to detect that.
> >
> > Thanks for the explanation. As Amit suggested, it seems we can document the
> > risk of misusing skip_xid. Besides, I found some minor things in the patch.
> >
> > 1) In 0002 patch
> >
> > + */
> > +static void
> > +pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
> > +{
> > +       if (subscriptionErrHash != NULL)
> > +               return;
> > +
> >
> > +static void
> > +pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
> > +{
> >
> > The second parameter "len" seems to be unused in the functions
> > pgstat_recv_subscription_purge() and pgstat_recv_subscription_error().
> >
>
> 'len' is unused not only in the functions the patch adds but also in
> the other pgstat_recv_* functions. Can we remove all of them in a
> separate patch? 'len' in pgstat_recv_* functions has never been used
> since the stats collector code was introduced. It seems that it
> was mistakenly introduced in the first commit, and the pgstat_recv_*
> functions added later followed suit, defining 'len' without ever
> using it.
>
> >
> > 2) in 0003 patch
> >
> >   * Helper function for apply_handle_commit and apply_handle_stream_commit.
> >   */
> >  static void
> > -apply_handle_commit_internal(StringInfo s, LogicalRepCommitData *commit_data)
> > +apply_handle_commit_internal(LogicalRepCommitData *commit_data)
> >  {
> >
> > This looks like a separate change which removes an unused parameter in existing
> > code; maybe we can get this committed first?
>
> Yeah, it seems to be introduced by commit 0926e96c493. I've attached
> the patch for that.
>
> Also, I've attached the updated version patches. This version patch
> has pg_stat_reset_subscription_error() SQL function and sends a clear
> message after skipping the transaction. 0004 patch includes the
> skipping transaction feature and introducing RESET to ALTER
> SUBSCRIPTION. It would be better to separate them.
>

I've attached new versions of the patches that fix the cfbot failure.


Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

July 29, 2021, 08:47:35

On Thu, Jul 29, 2021 at 2:04 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Jul 26, 2021 at 11:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Jul 22, 2021 at 8:53 PM houzj.fnst@fujitsu.com
> > <houzj.fnst@fujitsu.com> wrote:
> > >
> > > On July 20, 2021 9:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > On Mon, Jul 19, 2021 at 8:38 PM houzj.fnst@fujitsu.com
> > > > <houzj.fnst@fujitsu.com> wrote:
> > > > >
> > > > > On July 19, 2021 2:40 PM Masahiko Sawada <sawada.mshk@gmail.com>
> > > > wrote:
> > > > > > I've attached the updated version patch that incorporated all
> > > > > > comments I got so far except for the clearing error details part I
> > > > > > mentioned above. After getting a consensus on those parts, I'll
> > > > > > incorporate the idea into the patches.
> > > > >
> > > > > 3) For 0003 patch, if user set skip_xid to a wrong xid which have not been
> > > > >    assigned, and then will the change be skipped when the xid is assigned in
> > > > >    the future even if it doesn't cause any conflicts ?
> > > >
> > > > Yes. Currently, setting a correct xid is the user's responsibility. I think it would
> > > > be better to disable it or emit WARNING/ERROR when the user mistakenly set
> > > > the wrong xid if we find out a convenient way to detect that.
> > >
> > > Thanks for the explanation. As Amit suggested, it seems we can document the
> > > risk of misusing skip_xid. Besides, I found some minor things in the patch.
> > >
> > > 1) In 0002 patch
> > >
> > > + */
> > > +static void
> > > +pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
> > > +{
> > > +       if (subscriptionErrHash != NULL)
> > > +               return;
> > > +
> > >
> > > +static void
> > > +pgstat_recv_subscription_error(PgStat_MsgSubscriptionErr *msg, int len)
> > > +{
> > >
> > > The second parameter "len" seems to be unused in the functions
> > > pgstat_recv_subscription_purge() and pgstat_recv_subscription_error().
> > >
> >
> > 'len' is unused not only in the functions the patch adds but also in
> > the other pgstat_recv_* functions. Can we remove all of them in a
> > separate patch? 'len' in pgstat_recv_* functions has never been used
> > since the stats collector code was introduced. It seems that it
> > was mistakenly introduced in the first commit, and the pgstat_recv_*
> > functions added later followed suit, defining 'len' without ever
> > using it.
> >
> > >
> > > 2) in 0003 patch
> > >
> > >   * Helper function for apply_handle_commit and apply_handle_stream_commit.
> > >   */
> > >  static void
> > > -apply_handle_commit_internal(StringInfo s, LogicalRepCommitData *commit_data)
> > > +apply_handle_commit_internal(LogicalRepCommitData *commit_data)
> > >  {
> > >
> > > This looks like a separate change which removes an unused parameter in existing
> > > code; maybe we can get this committed first?
> >
> > Yeah, it seems to be introduced by commit 0926e96c493. I've attached
> > the patch for that.
> >
> > Also, I've attached the updated version patches. This version patch
> > has pg_stat_reset_subscription_error() SQL function and sends a clear
> > message after skipping the transaction. 0004 patch includes the
> > skipping transaction feature and introducing RESET to ALTER
> > SUBSCRIPTION. It would be better to separate them.
> >
>
> I've attached the new version patches that fix cfbot failure.

Sorry, I attached the wrong ones. I've reattached the correct versions of the patches.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

On Mon, Aug 2, 2021 at 12:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Aug 2, 2021 at 7:45 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Jul 30, 2021 at 12:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Jul 29, 2021 at 11:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > Setting up logical rep error context in a generic function looks a bit
> > > odd to me. Do we really need to set up error context here? I
> > > understand we can't do this in caller but anyway I think we are not
> > > sending this to logical replication view as well, so not sure we need
> > > to do it here.
> >
> > Yeah, I'm not convinced of this part yet. I wanted to show relid also
> > in truncate cases but I came up with only this idea.
> >
> > If an error happens during truncating the table (in
> > ExecuteTruncateGuts()), relid set by
> > set_logicalrep_error_context_rel() is actually sent to the view. If we
> > don’t have it, the view always shows relid as NULL in truncate cases.
> > On the other hand, it doesn’t cover all cases. For example, it doesn’t
> > cover an error that the target table doesn’t exist on the subscriber,
> > which happens when opening the target table. Anyway, in most cases,
> > even if relid is NULL, the error message in the view helps users to
> > know which relation the error happened on. What do you think?
> >
>
> Yeah, I also think at this stage error message is sufficient in such cases.

I've attached new patches that incorporate all comments I got so far.
Please review them.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

On Thu, Aug 5, 2021 at 5:58 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Tuesday, August 3, 2021 3:49 PM  Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > I've attached new patches that incorporate all comments I got so far.
> > Please review them.
> Hi, I had a chance to look at the patch-set during my other development.
> Just let me share some minor cosmetic things.

Thank you for reviewing the patches!

>
>
> [1] unnatural wording ? in v5-0002.
> + * create tells whether to create the new subscription entry if it is not
> + * create tells whether to create the new subscription relation entry if it is
>
> I'm not sure if this wording is correct or not.
> You meant just "tells whether to create ...." ?,
> although we already have 1 other "create tells" in HEAD.

'create' here refers to the function argument of
pgstat_get_subscription_entry() and
pgstat_get_subscription_error_entry(). That is, the function argument
'create' tells whether to create a new entry if one is not found. I
single-quoted 'create' to avoid confusion.

>
> [2] typo "kep" in v05-0002.
>
> I think you meant "kept" in below sentence.
>
> +/*
> + * Subscription error statistics kep in the stats collector.  One entry represents
> + * an error that happened during logical replication, reported by the apply worker
> + * (subrelid is InvalidOid) or by the table sync worker (subrelid is a valid OID).

Fixed.

>
> [3] typo "lotigcal" in the v05-0004 commit message.
>
> If incoming change violates any constraint, lotigcal replication stops
> until it's resolved. This commit introduces another way to skip the
> transaction in question.
>
> It should be "logical".

Fixed.

>
> [4] warning of doc build
>
> I've gotten an output like below during my process of make html.
> Could you please check this ?
>
> Link element has no content and no Endterm. Nothing to show in the link to monitoring-pg-stat-subscription-errors

Fixed.

I've attached the latest patches that incorporated all comments I got
so far. Please review them.


Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

On Tue, Aug 10, 2021 at 7:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Aug 10, 2021 at 11:59 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Aug 10, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > I've attached the latest patches that incorporated all comments I got
> > > so far. Please review them.
> > >
> >
> > I am not able to apply the latest patch
> > (v6-0001-Add-errcontext-to-errors-happening-during-applyin) on HEAD,
> > getting the below error:
> >
>
> Few comments on v6-0001-Add-errcontext-to-errors-happening-during-applyin

Thank you for the comments!

> ==============================================================
>
> 1. While applying DML operations, we are setting up the error context
> multiple times due to which the context information is not
> appropriate. The first is set in apply_dispatch and then during
> processing, we set another error callback slot_store_error_callback in
> slot_store_data and slot_modify_data. When I forced one of the errors
> in slot_store_data(), it displays the below information in CONTEXT
> which doesn't make much sense.
>
> 2021-08-10 15:16:39.887 IST [6784] ERROR:  incorrect binary data
> format in logical replication column 1
> 2021-08-10 15:16:39.887 IST [6784] CONTEXT:  processing remote data
> for replication target relation "public.test1" column "id"
>         during apply of "INSERT" for relation "public.test1" in
> transaction with xid 740 committs 2021-08-10 14:44:38.058174+05:30

Yes, but we cannot change the error context message depending on other
error context messages. So it seems hard to construct a complete
sentence in the context message that is okay in terms of English
grammar. Is the following message better?

CONTEXT:  processing remote data for replication target relation
"public.test1" column "id"
         applying "INSERT" for relation "public.test1" in transaction
with xid 740 committs 2021-08-10 14:44:38.058174+05:30

>
> 2.
> I think we can slightly change the new context information as below:
> Before
> during apply of "INSERT" for relation "public.test1" in transaction
> with xid 740 committs 2021-08-10 14:44:38.058174+05:30
> After
> during apply of "INSERT" for relation "public.test1" in transaction id
> 740 with commit timestamp 2021-08-10 14:44:38.058174+05:30

Fixed.

>
> 3.
> +/* Struct for saving and restoring apply information */
> +typedef struct ApplyErrCallbackArg
> +{
> + LogicalRepMsgType command; /* 0 if invalid */
> +
> + /* Local relation information */
> + char    *nspname;
> + char    *relname;
>
> ...
> ...
>
> +
> +static ApplyErrCallbackArg apply_error_callback_arg =
> +{
> + .command = 0,
> + .relname = NULL,
> + .nspname = NULL,
>
> Let's initialize the struct members in the order they are declared.
> The order of relname and nspname should be the other way around.

Fixed.

> 4.
> +
> + TransactionId remote_xid;
> + TimestampTz committs;
> +} ApplyErrCallbackArg;
>
> It might be better to add a comment like "remote xact information"
> above these structure members.

Fixed.

>
> 5.
> +static void
> +apply_error_callback(void *arg)
> +{
> + StringInfoData buf;
> +
> + if (apply_error_callback_arg.command == 0)
> + return;
> +
> + initStringInfo(&buf);
>
> At the end of this call, it is better to free this (pfree(buf.data))

Fixed.

>
> 6. In the commit message, you might want to indicate that this
> additional information can be used by the future patch to skip the
> conflicting transaction.

Fixed.

I've attached the new patches. Please review them.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

On Wed, Aug 11, 2021 at 5:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Aug 11, 2021 at 11:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Aug 10, 2021 at 7:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > ==============================================================
> > >
> > > 1. While applying DML operations, we are setting up the error context
> > > multiple times due to which the context information is not
> > > appropriate. The first is set in apply_dispatch and then during
> > > processing, we set another error callback slot_store_error_callback in
> > > slot_store_data and slot_modify_data. When I forced one of the errors
> > > in slot_store_data(), it displays the below information in CONTEXT
> > > which doesn't make much sense.
> > >
> > > 2021-08-10 15:16:39.887 IST [6784] ERROR:  incorrect binary data
> > > format in logical replication column 1
> > > 2021-08-10 15:16:39.887 IST [6784] CONTEXT:  processing remote data
> > > for replication target relation "public.test1" column "id"
> > >         during apply of "INSERT" for relation "public.test1" in
> > > transaction with xid 740 committs 2021-08-10 14:44:38.058174+05:30
> >
> > Yes, but we cannot change the error context message depending on other
> > error context messages. So it seems hard to construct a complete
> > sentence in the context message that is okay in terms of English
> > grammar. Is the following message better?
> >
> > CONTEXT:  processing remote data for replication target relation
> > "public.test1" column "id"
> >          applying "INSERT" for relation "public.test1" in transaction
> > with xid 740 committs 2021-08-10 14:44:38.058174+05:30
> >
>
> I don't like the proposed text. How about if we combine both and have
> something like: "processing remote data during "UPDATE" for
> replication target relation "public.test1" column "id" in transaction
> id 740 with commit timestamp 2021-08-10 14:44:38.058174+05:30"? For
> this, I think we need to remove slot_store_error_callback and
> add/change the ApplyErrCallbackArg to include the additional required
> information in that callback.

Oh, I've never thought about that. That's a good idea.
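
To make the idea concrete, here is a rough standalone sketch of a single
callback that builds the combined message. The struct and function names
below (ApplyContextInfo, format_apply_context) are made up for
illustration and are not what the patch uses; the real callback would
call errcontext() rather than snprintf(), and the commit timestamp is
assumed to be pre-formatted text here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, cut-down version of the information the callback needs. */
typedef struct ApplyContextInfo
{
    const char *command;        /* e.g. "UPDATE"; NULL if nothing is applied */
    const char *nspname;        /* schema name; only read if relname is set */
    const char *relname;        /* NULL if no target relation is known yet */
    const char *colname;        /* NULL if no column is being processed */
    unsigned int remote_xid;    /* remote transaction id */
    const char *committs;       /* commit timestamp, already formatted */
} ApplyContextInfo;

/*
 * Build one context line from whatever is known so far. Returns the
 * value of snprintf(), or 0 if there is nothing to report.
 */
static int
format_apply_context(char *buf, size_t buflen, const ApplyContextInfo *info)
{
    assert(buf != NULL && buflen > 0);
    buf[0] = '\0';

    if (info->command == NULL)
        return 0;

    /* No relation known: report only the command and the transaction. */
    if (info->relname == NULL)
        return snprintf(buf, buflen,
                        "processing remote data during \"%s\" in transaction id %u with commit timestamp %s",
                        info->command, info->remote_xid, info->committs);

    /* Relation known but no column: omit the column part. */
    if (info->colname == NULL)
        return snprintf(buf, buflen,
                        "processing remote data during \"%s\" for replication target relation \"%s.%s\" in transaction id %u with commit timestamp %s",
                        info->command, info->nspname, info->relname,
                        info->remote_xid, info->committs);

    /* Everything known: the full message proposed above. */
    return snprintf(buf, buflen,
                    "processing remote data during \"%s\" for replication target relation \"%s.%s\" column \"%s\" in transaction id %u with commit timestamp %s",
                    info->command, info->nspname, info->relname,
                    info->colname, info->remote_xid, info->committs);
}
```

The point of the sketch is only that one callback with optional fields
can produce a single grammatical sentence, which two independently
stacked callbacks cannot.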

I've attached the updated patches. FYI I've included the patch
(v8-0005) that fixes the assertion failure during shared fileset
cleanup to make cfbot tests happy.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

On Wed, Aug 18, 2021 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Aug 18, 2021 at 6:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Aug 17, 2021 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > > It's right that we use "STREAM STOP" rather than "STREAM END" in many
> > > > places such as elog messages, a callback name, and source code
> > > > comments. As far as I have found there are two places where we’re
> > > > using "STREAM END": LOGICAL_REP_MSG_STREAM_END and a description in
> > > > doc/src/sgml/protocol.sgml. Isn't it better to fix these
> > > > inconsistencies in the first place? I think “STREAM STOP” would be
> > > > more appropriate.
> > > >
> > >
> > > I think keeping STREAM_END in the enum 'LOGICAL_REP_MSG_STREAM_END'
> > > seems to be a bit better because of the value 'E' we use for it.
> >
> > But I think we don't care about the actual value of
> > LOGICAL_REP_MSG_STREAM_END since we use the enum value rather than
> > 'E'?
> >
>
> True, but here we are trying to be consistent with other enum values
> where we try to use the first letter of the last word (which is E in
> this case). I can see there are other cases where we are not
> consistent so it won't be a big deal if we won't be consistent here. I
> am neutral on this one, so, if you feel using STREAM_STOP would be
> better from a code readability perspective then that is fine.

In addition to code readability, there is a passage in the doc
that mentions "Stream End" while the later description uses "Stream
Stop", which seems like a bug in the doc to me:

The following messages (Stream Start, Stream End, Stream Commit, and
Stream Abort) are available since protocol version 2.

</para>

(snip)

<varlistentry>
<term>
Stream Stop
</term>
<listitem>

Perhaps it's better to hear other opinions too, but I've attached the
patch. Please review it.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

0001-Rename-LOGICAL_REP_MSG_STREAM_END-to-LOGICAL_REP_MSG.patch

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

August 18, 2021, 09:14:57

On Wed, Aug 18, 2021 at 10:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Aug 18, 2021 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Aug 18, 2021 at 6:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Tue, Aug 17, 2021 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > > It's right that we use "STREAM STOP" rather than "STREAM END" in many
> > > > > places such as elog messages, a callback name, and source code
> > > > > comments. As far as I have found there are two places where we’re
> > > > > > using "STREAM END": LOGICAL_REP_MSG_STREAM_END and a description in
> > > > > doc/src/sgml/protocol.sgml. Isn't it better to fix these
> > > > > inconsistencies in the first place? I think “STREAM STOP” would be
> > > > > more appropriate.
> > > > >
> > > >
> > > > I think keeping STREAM_END in the enum 'LOGICAL_REP_MSG_STREAM_END'
> > > > seems to be a bit better because of the value 'E' we use for it.
> > >
> > > But I think we don't care about the actual value of
> > > LOGICAL_REP_MSG_STREAM_END since we use the enum value rather than
> > > 'E'?
> > >
> >
> > True, but here we are trying to be consistent with other enum values
> > where we try to use the first letter of the last word (which is E in
> > this case). I can see there are other cases where we are not
> > consistent so it won't be a big deal if we won't be consistent here. I
> > am neutral on this one, so, if you feel using STREAM_STOP would be
> > better from a code readability perspective then that is fine.
>
> In addition to code readability, there is a passage in the doc
> that mentions "Stream End" while the later description uses "Stream
> Stop", which seems like a bug in the doc to me:
>

The doc changes look good to me. But I have a question about the code change:

--- a/src/include/replication/logicalproto.h
+++ b/src/include/replication/logicalproto.h
@@ -65,7 +65,7 @@ typedef enum LogicalRepMsgType
  LOGICAL_REP_MSG_COMMIT_PREPARED = 'K',
  LOGICAL_REP_MSG_ROLLBACK_PREPARED = 'r',
  LOGICAL_REP_MSG_STREAM_START = 'S',
- LOGICAL_REP_MSG_STREAM_END = 'E',
+ LOGICAL_REP_MSG_STREAM_STOP = 'E',
  LOGICAL_REP_MSG_STREAM_COMMIT = 'c',

As this changes the enum name, any extension (e.g., a logical
replication extension) that has already started using it would need a
change. As this is a recent change in PG-14, it might be okay, but
OTOH, as this is just a code readability change, shall we do it only
for PG-15?
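
If the rename were done only in PG-15, an extension that has to build
against both branches could presumably bridge it with a version-guarded
macro. The sketch below is self-contained, so PG_VERSION_NUM and the
enum are stand-ins for what pg_config.h and logicalproto.h would
provide, and is_stream_stop() is a hypothetical helper:

```c
#include <assert.h>

/* Stand-in for the value normally provided by pg_config.h. */
#ifndef PG_VERSION_NUM
#define PG_VERSION_NUM 140000
#endif

/* Stand-in for the server's enum as it looks before the rename. */
typedef enum LogicalRepMsgType
{
    LOGICAL_REP_MSG_STREAM_START = 'S',
    LOGICAL_REP_MSG_STREAM_END = 'E'
} LogicalRepMsgType;

/*
 * Compatibility shim: on older branches, map the new name onto the old
 * one so that the rest of the extension can use the new name only.
 */
#if PG_VERSION_NUM < 150000
#define LOGICAL_REP_MSG_STREAM_STOP LOGICAL_REP_MSG_STREAM_END
#endif

/* Hypothetical helper written against the new name. */
static int
is_stream_stop(char msgtype)
{
    return msgtype == LOGICAL_REP_MSG_STREAM_STOP;
}
```

The wire value stays 'E' either way, so only source compatibility is
affected, not the protocol.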

--
With Regards,
Amit Kapila.

RE: Skipping logical replication transactions on subscriber side

From

"houzj.fnst@fujitsu.com"

Date:

August 18, 2021, 09:33:15

On Tues, Aug 17, 2021 1:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Mon, Aug 16, 2021 at 3:59 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> > 3)
> > Do we need to invoke set_apply_error_context_xact() in the function
> > apply_handle_stream_prepare() to save the xid and timestamp ?
> 
> Yes. I think that v8-0001 patch already set xid and timestamp just after parsing
> stream_prepare message. You meant it's not necessary?

Sorry, I was mistaken; please ignore the above comment.

> 
> I'll submit the updated patches soon.

I was thinking about the place to set the errcallback.callback.

apply_dispatch(StringInfo s)
 {
     LogicalRepMsgType action = pq_getmsgbyte(s);
+    ErrorContextCallback errcallback;
+    bool        set_callback = false;
+
+    /*
+     * Push apply error context callback if not yet. Other fields will be
+     * filled during applying the change.  Since this function can be called
+     * recursively when applying spooled changes, we set the callback only
+     * once.
+     */
+    if (apply_error_callback_arg.command == 0)
+    {
+        errcallback.callback = apply_error_callback;
+        errcallback.previous = error_context_stack;
+        error_context_stack = &errcallback;
+        set_callback = true;
+    }
...
+    /* Pop the error context stack */
+    if (set_callback)
+        error_context_stack = errcallback.previous;

It seems we can put the above code in the function LogicalRepApplyLoop(),
around the call to apply_dispatch(); with that approach we don't need to worry
about the recursive case. What do you think?
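
As a rough illustration of this idea, the standalone sketch below pushes
the callback once around the loop, so that nested apply_dispatch() calls
never touch the stack. The types are simplified stand-ins for
PostgreSQL's ErrorContextCallback and error_context_stack, and
apply_loop(), stack_depth() and max_depth_seen are hypothetical names,
not code from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for PostgreSQL's error context callback entry. */
typedef struct ErrorContextCallback
{
    struct ErrorContextCallback *previous;
    void        (*callback) (void *arg);
    void       *arg;
} ErrorContextCallback;

static ErrorContextCallback *error_context_stack = NULL;

/* Deepest stack observed while dispatching; used only for the demo. */
static int  max_depth_seen = 0;

static void
apply_error_callback(void *arg)
{
    (void) arg;                 /* the real callback would emit errcontext() */
}

/* Count the entries currently on the stack. */
static int
stack_depth(void)
{
    int         depth = 0;

    for (ErrorContextCallback *c = error_context_stack; c != NULL; c = c->previous)
        depth++;
    return depth;
}

/* May recurse (e.g. when replaying spooled changes); never pushes or pops. */
static void
apply_dispatch(int nested)
{
    int         depth = stack_depth();

    /* The surrounding loop must already have pushed the callback. */
    assert(error_context_stack != NULL);

    if (depth > max_depth_seen)
        max_depth_seen = depth;
    if (nested > 0)
        apply_dispatch(nested - 1);
}

/* Push the callback once before the loop and pop it once after it. */
static void
apply_loop(int nmessages)
{
    ErrorContextCallback errcallback;

    errcallback.callback = apply_error_callback;
    errcallback.arg = NULL;
    errcallback.previous = error_context_stack;
    error_context_stack = &errcallback;

    for (int i = 0; i < nmessages; i++)
        apply_dispatch(2);

    error_context_stack = errcallback.previous;
}
```

With this layout the set_callback flag is no longer needed, because the
push and pop happen exactly once regardless of how deep the dispatch
recursion goes.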

Best regards,
Hou zj

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

August 18, 2021, 09:41:27

On Wed, Aug 18, 2021 at 3:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Aug 18, 2021 at 10:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Aug 18, 2021 at 12:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Aug 18, 2021 at 6:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Tue, Aug 17, 2021 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > > It's right that we use "STREAM STOP" rather than "STREAM END" in many
> > > > > > places such as elog messages, a callback name, and source code
> > > > > > comments. As far as I have found there are two places where we’re
> > > > > > > using "STREAM END": LOGICAL_REP_MSG_STREAM_END and a description in
> > > > > > doc/src/sgml/protocol.sgml. Isn't it better to fix these
> > > > > > inconsistencies in the first place? I think “STREAM STOP” would be
> > > > > > more appropriate.
> > > > > >
> > > > >
> > > > > I think keeping STREAM_END in the enum 'LOGICAL_REP_MSG_STREAM_END'
> > > > > seems to be a bit better because of the value 'E' we use for it.
> > > >
> > > > But I think we don't care about the actual value of
> > > > LOGICAL_REP_MSG_STREAM_END since we use the enum value rather than
> > > > 'E'?
> > > >
> > >
> > > True, but here we are trying to be consistent with other enum values
> > > where we try to use the first letter of the last word (which is E in
> > > this case). I can see there are other cases where we are not
> > > consistent so it won't be a big deal if we won't be consistent here. I
> > > am neutral on this one, so, if you feel using STREAM_STOP would be
> > > better from a code readability perspective then that is fine.
> >
> > In addition to code readability, there is a passage in the doc
> > that mentions "Stream End" while the later description uses "Stream
> > Stop", which seems like a bug in the doc to me:
> >
>
> The doc changes look good to me. But I have a question about the code change:
>
> --- a/src/include/replication/logicalproto.h
> +++ b/src/include/replication/logicalproto.h
> @@ -65,7 +65,7 @@ typedef enum LogicalRepMsgType
>   LOGICAL_REP_MSG_COMMIT_PREPARED = 'K',
>   LOGICAL_REP_MSG_ROLLBACK_PREPARED = 'r',
>   LOGICAL_REP_MSG_STREAM_START = 'S',
> - LOGICAL_REP_MSG_STREAM_END = 'E',
> + LOGICAL_REP_MSG_STREAM_STOP = 'E',
>   LOGICAL_REP_MSG_STREAM_COMMIT = 'c',
>
> As this is changing the enum name and if any extension (logical
> replication extension) has started using it then they would require a
> change. As this is the latest change in PG-14, so it might be okay but
> OTOH, as this is just a code readability change, shall we do it only
> for PG-15?

I think that the doc changes could be backpatched to PG14 but I think
we should do the code change only for PG15.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

RE: Skipping logical replication transactions on subscriber side

From

"houzj.fnst@fujitsu.com"

Date:

August 18, 2021, 10:19:07

On Wed, Aug 18, 2021 2:41 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> 
> On Wed, Aug 18, 2021 at 3:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Aug 18, 2021 at 10:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > In addition to code readability, there is a passage in the doc
> > > that mentions "Stream End" while the later description uses
> > > "Stream Stop", which seems like a bug in the doc to me:
> > >
> >
> > The doc changes look good to me. But I have a question about the code change:
> >
> > --- a/src/include/replication/logicalproto.h
> > +++ b/src/include/replication/logicalproto.h
> > @@ -65,7 +65,7 @@ typedef enum LogicalRepMsgType
> >   LOGICAL_REP_MSG_COMMIT_PREPARED = 'K',
> >   LOGICAL_REP_MSG_ROLLBACK_PREPARED = 'r',
> >   LOGICAL_REP_MSG_STREAM_START = 'S',
> > - LOGICAL_REP_MSG_STREAM_END = 'E',
> > + LOGICAL_REP_MSG_STREAM_STOP = 'E',
> >   LOGICAL_REP_MSG_STREAM_COMMIT = 'c',
> >
> > As this is changing the enum name and if any extension (logical
> > replication extension) has started using it then they would require a
> > change. As this is the latest change in PG-14, so it might be okay but
> > OTOH, as this is just a code readability change, shall we do it only
> > for PG-15?
> 
> I think that the doc changes could be backpatched to PG14 but I think we
> should do the code change only for PG15.

+1, and the patch looks good to me.

Best regards,
Hou zj

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

August 18, 2021, 11:39:08

On Wed, Aug 18, 2021 at 3:33 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Tues, Aug 17, 2021 1:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > On Mon, Aug 16, 2021 at 3:59 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> > > 3)
> > > Do we need to invoke set_apply_error_context_xact() in the function
> > > apply_handle_stream_prepare() to save the xid and timestamp ?
> >
> > Yes. I think that v8-0001 patch already set xid and timestamp just after parsing
> > stream_prepare message. You meant it's not necessary?
>
> Sorry, I thought of something wrong, please ignore the above comment.
>
> >
> > I'll submit the updated patches soon.
>
> I was thinking about the place to set the errcallback.callback.
>
> apply_dispatch(StringInfo s)
>  {
>         LogicalRepMsgType action = pq_getmsgbyte(s);
> +       ErrorContextCallback errcallback;
> +       bool            set_callback = false;
> +
> +       /*
> +        * Push apply error context callback if not yet. Other fields will be
> +        * filled during applying the change.  Since this function can be called
> +        * recursively when applying spooled changes, we set the callback only
> +        * once.
> +        */
> +       if (apply_error_callback_arg.command == 0)
> +       {
> +               errcallback.callback = apply_error_callback;
> +               errcallback.previous = error_context_stack;
> +               error_context_stack = &errcallback;
> +               set_callback = true;
> +       }
> ...
> +       /* Pop the error context stack */
> +       if (set_callback)
> +               error_context_stack = errcallback.previous;
>
> It seems we can put the above code in the function LogicalRepApplyLoop(),
> around the call to apply_dispatch(); with that approach we don't need to worry
> about the recursive case. What do you think?

Thank you for the comment!

I think you're right. Maybe we can set the callback before entering
the main loop and pop it after breaking out of it. It would also fix the
problem reported by Tang[1]. One thing to note, though: we
want to reset apply_error_callback_arg.command at the end of
apply_dispatch() (otherwise we could end up attaching the apply error
context to an irrelevant error such as a network error). So when
apply_dispatch() is called recursively, we probably need to save
apply_error_callback_arg.command before setting the new command and
then restore the saved command afterwards. Is that right?
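
A minimal sketch of that save-and-restore pattern, assuming a cut-down
ApplyErrCallbackArg with only the command field; the two-argument
apply_dispatch() and command_after_nested exist only for this
illustration and are not in the patch:

```c
#include <assert.h>

/* Cut-down stand-in: 0 means "no command is being applied". */
typedef struct ApplyErrCallbackArg
{
    int         command;
} ApplyErrCallbackArg;

static ApplyErrCallbackArg apply_error_callback_arg = {.command = 0};

/* What the outer call sees right after a nested call returns (demo only). */
static int  command_after_nested = -1;

/*
 * Dispatch one message. If nested_action is nonzero, simulate applying a
 * spooled change by calling ourselves recursively with that action.
 */
static void
apply_dispatch(int action, int nested_action)
{
    /* Remember whatever the caller had set, then install our command. */
    int         saved_command = apply_error_callback_arg.command;

    assert(action != 0);        /* 0 is reserved for "no command" */
    apply_error_callback_arg.command = action;

    if (nested_action != 0)
    {
        apply_dispatch(nested_action, 0);
        command_after_nested = apply_error_callback_arg.command;
    }

    /*
     * Restore instead of unconditionally resetting to 0: the top-level
     * call goes back to 0, while a nested call goes back to the outer
     * command.
     */
    apply_error_callback_arg.command = saved_command;
}
```

For example, after apply_dispatch('c', 'I') the outer call sees 'c'
again once the nested 'I' has returned, and the command is back to 0
when the outer call finishes.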

Regards,

[1]
https://www.postgresql.org/message-id/OS0PR01MB6113E5BC24922A2D05D16051FBFE9%40OS0PR01MB6113.jpnprd01.prod.outlook.com

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

August 19, 2021, 04:47:30

On Wed, Aug 18, 2021 at 5:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Aug 18, 2021 at 3:33 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Tues, Aug 17, 2021 1:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > On Mon, Aug 16, 2021 at 3:59 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> > > > 3)
> > > > Do we need to invoke set_apply_error_context_xact() in the function
> > > > apply_handle_stream_prepare() to save the xid and timestamp ?
> > >
> > > Yes. I think that v8-0001 patch already set xid and timestamp just after parsing
> > > stream_prepare message. You meant it's not necessary?
> >
> > Sorry, I thought of something wrong, please ignore the above comment.
> >
> > >
> > > I'll submit the updated patches soon.
> >
> > I was thinking about the place to set the errcallback.callback.
> >
> > apply_dispatch(StringInfo s)
> >  {
> >         LogicalRepMsgType action = pq_getmsgbyte(s);
> > +       ErrorContextCallback errcallback;
> > +       bool            set_callback = false;
> > +
> > +       /*
> > +        * Push apply error context callback if not yet. Other fields will be
> > +        * filled during applying the change.  Since this function can be called
> > +        * recursively when applying spooled changes, we set the callback only
> > +        * once.
> > +        */
> > +       if (apply_error_callback_arg.command == 0)
> > +       {
> > +               errcallback.callback = apply_error_callback;
> > +               errcallback.previous = error_context_stack;
> > +               error_context_stack = &errcallback;
> > +               set_callback = true;
> > +       }
> > ...
> > +       /* Pop the error context stack */
> > +       if (set_callback)
> > +               error_context_stack = errcallback.previous;
> >
> > It seems we can put the above code in the function LogicalRepApplyLoop(),
> > around the call to apply_dispatch(); with that approach we don't need to worry
> > about the recursive case. What do you think?
>
> Thank you for the comment!
>
> I think you're right. Maybe we can set the callback before entering to
> the main loop and pop it after breaking from it. It would also fix the
> problem reported by Tang[1]. But one thing we need to note that since
> we want to reset apply_error_callback_arg.command at the end of
> apply_dispatch() (otherwise we could end up setting the apply error
> context to an irrelevant error such as network error), when
> apply_dispatch() is called recursively probably we need to save the
> apply_error_callback_arg.command before setting the new command and
> then revert back to the saved command. Is that right?

I've attached updated patches that incorporate all the comments I've
received so far, unless I'm missing something. Please review
them.
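
The save-and-restore idea from the quoted paragraph can be sketched as follows. This is a minimal standalone illustration, not code from the patch: the MsgType enum, the trace array, record_context() and the nspooled parameter are hypothetical stand-ins, and only the name apply_error_callback_arg.command is taken from the discussion above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real PostgreSQL definitions. */
typedef enum
{
    MSG_NONE = 0,               /* 0 means "no command is being applied" */
    MSG_INSERT = 'I',
    MSG_STREAM_COMMIT = 'c'
} MsgType;

typedef struct
{
    MsgType command;            /* command currently being applied */
} ApplyErrorCallbackArg;

static ApplyErrorCallbackArg apply_error_callback_arg = { MSG_NONE };

/* Records which command an error raised at this point would be blamed on. */
#define TRACE_MAX 16
static MsgType trace[TRACE_MAX];
static int trace_len = 0;

static void
record_context(void)
{
    if (trace_len < TRACE_MAX)
        trace[trace_len++] = apply_error_callback_arg.command;
}

/*
 * Dispatch one message.  A stream commit replays nspooled spooled changes by
 * calling apply_dispatch() recursively, so the outer command is saved on
 * entry and restored on exit instead of being reset to MSG_NONE.
 */
static void
apply_dispatch(MsgType action, int nspooled)
{
    MsgType saved_command = apply_error_callback_arg.command;

    apply_error_callback_arg.command = action;
    record_context();

    if (action == MSG_STREAM_COMMIT)
    {
        for (int i = 0; i < nspooled; i++)
            apply_dispatch(MSG_INSERT, 0);

        /* Back in the outer call: the context is the stream commit again. */
        record_context();
    }

    /* Top level restores MSG_NONE; nested calls restore the outer command. */
    apply_error_callback_arg.command = saved_command;
}
```

With this shape, an unrelated error raised after the outermost call returns (a network error, say) sees MSG_NONE and gets no apply context attached, while an error raised after the spooled changes are replayed is still attributed to the outer command.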

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

On Thu, Aug 19, 2021 at 10:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Aug 19, 2021 at 9:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Aug 19, 2021 at 12:47 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Tue, Aug 17, 2021 at 12:00 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> > > >
> > >
> > > > Another comment on the 0001 patch: as there is now a mix of setting
> > > > "apply_error_callback_arg" members directly and also through inline
> > > > functions, it might look better to have it done consistently with
> > > > functions having prototypes something like the following:
> > > >
> > > > static inline void set_apply_error_context_rel(LogicalRepRelMapEntry *rel);
> > > > static inline void reset_apply_error_context_rel(void);
> > > > static inline void set_apply_error_context_attnum(int remote_attnum);
> > >
> > > It might look consistent, but if we do that, won't we end up needing a
> > > function for every field we update whenever we add new fields to the
> > > struct in the future?
> > >
> >
> > Yeah, I also think it is too much, but we can add comments wherever
> > we set the information for the error callback. I see one is missing where
> > the patch sets remote_attnum; please check other similar places and add
> > comments if not already there.
>
> Agreed. Will add comments in the next version of the patch.

I've attached updated patches. Please review them.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

On Tue, Aug 24, 2021 at 10:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Aug 24, 2021 at 11:44 AM tanghy.fnst@fujitsu.com
> <tanghy.fnst@fujitsu.com> wrote:
> >
> > On Monday, August 23, 2021 11:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > I've attached updated patches. Please review them.
> > >
> >
> > I tested the v10-0001 patch further in both streaming and non-streaming modes. All tests work well.
> >
> > I also tried two-phase commit feature, the error context was set as expected,
> > but please allow me to propose a fix suggestion on the error description:

Thank you for the suggestion!

> >
> > CONTEXT:  processing remote data during "INSERT" for replication target relation
> > "public.test" in transaction 714 with commit timestamp 2021-08-24
> > 13:20:22.480532+08
> >
> > It said "commit timestamp", but for 2pc feature, the timestamp could be "prepare timestamp" or "rollback
> > timestamp", too.
> > Could we make some change to make the error log more comprehensive?
> >
>
> I think we can write something like: (processing remote data during
> "INSERT" for replication target relation "public.test" in transaction
> 714 at 2021-08-24 13:20:22.480532+08). Basically replacing "with
> commit timestamp" with "at". This is similar to what we do in the
> test_decoding module for the transaction timestamp.

+1

> The other idea could
> be we print the exact operation like commit/prepare/rollback which is
> also possible because we have that information while setting context
> info but that might add a bit more complexity which I don't think is
> worth it.

Agreed.

I replaced "with commit timestamp" with "at" and renamed the 'commit_ts'
field to 'ts'.

>
> One more point about the  v10-0001* patch: From the commit message
> "Add logical changes details to errcontext of apply worker errors.",
> it appears that the context will be added only for the apply worker
> but won't it get added for tablesync worker as well during its sync
> phase (when it tries to catch up with apply worker)?

Right. I've updated the message.

I've attached updated patches. Please review them.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

On Thu, Aug 26, 2021 at 4:42 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Aug 26, 2021 at 3:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > 1.
> > + if (errarg->rel)
> > + appendStringInfo(&buf, _(" for replication target relation \"%s.%s\""),
> > + errarg->rel->remoterel.nspname,
> > + errarg->rel->remoterel.relname);
> > +
> > + if (errarg->remote_attnum >= 0)
> > + appendStringInfo(&buf, _(" column \"%s\""),
> > + errarg->rel->remoterel.attnames[errarg->remote_attnum]);
> >
> > Isn't it better if 'remote_attnum' check is inside if (errarg->rel)
> > check? It will be weird to print column information without rel
> > information and in the current code, we don't set remote_attnum
> > without rel. The other possibility could be to have an Assert for rel
> > in 'remote_attnum' check.
>
> Agreed to check 'remote_attnum' inside "if (errarg->rel)".
>

Okay, changed accordingly. Additionally, I have changed the code that
set the timestamp to "(unset)" when it is 0, so that the timestamp is
not displayed at all in that case. I have also made a few other cosmetic
changes in the attached patch. Please take a look and let me know what you think.
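
For readers without the patch at hand, the resulting message construction can be sketched roughly as below. This is a simplified standalone illustration under stated assumptions, not the committed code: it uses snprintf-style formatting instead of StringInfo, the RemoteRel and ApplyErrorArg structs and the append_fmt() helper are hypothetical, and the timestamp is passed as preformatted text with NULL standing in for a zero timestamp. The wording follows the CONTEXT line quoted earlier in the thread.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the structs used by the patch. */
typedef struct
{
    const char  *nspname;
    const char  *relname;
    const char **attnames;
} RemoteRel;

typedef struct
{
    const char      *command;        /* e.g. "INSERT" */
    const RemoteRel *rel;            /* NULL when no relation is known */
    int              remote_attnum;  /* -1 when no column is known */
    unsigned int     remote_xid;
    const char      *ts;             /* formatted timestamp, NULL if unset */
} ApplyErrorArg;

/* Append formatted text to a NUL-terminated buffer, truncating if full. */
static void
append_fmt(char *buf, size_t buflen, const char *fmt, ...)
{
    size_t  used = strlen(buf);
    va_list ap;

    if (used + 1 >= buflen)
        return;
    va_start(ap, fmt);
    vsnprintf(buf + used, buflen - used, fmt, ap);
    va_end(ap);
}

static void
build_apply_context(const ApplyErrorArg *arg, char *buf, size_t buflen)
{
    assert(arg != NULL && buf != NULL && buflen > 0);
    buf[0] = '\0';

    append_fmt(buf, buflen, "processing remote data during \"%s\"",
               arg->command);

    if (arg->rel != NULL)
    {
        append_fmt(buf, buflen, " for replication target relation \"%s.%s\"",
                   arg->rel->nspname, arg->rel->relname);

        /* The column check is nested: never print a column without a rel. */
        if (arg->remote_attnum >= 0)
            append_fmt(buf, buflen, " column \"%s\"",
                       arg->rel->attnames[arg->remote_attnum]);
    }

    append_fmt(buf, buflen, " in transaction %u", arg->remote_xid);

    /* An unset timestamp is omitted entirely rather than shown as a value. */
    if (arg->ts != NULL)
        append_fmt(buf, buflen, " at %s", arg->ts);
}
```

Because the column check sits inside the relation check, a stale remote_attnum can never dereference a NULL rel, which is the concern raised in point 1.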

Note - I have just attached the first patch here, once this is
committed we can focus on others.

-- 
With Regards,
Amit Kapila.

Attachments

v12-0001-Add-logical-change-details-to-logical-replicatio.patch

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

August 26, 2021, 15:53:46

On Thu, Aug 26, 2021 at 9:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Aug 26, 2021 at 4:42 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Aug 26, 2021 at 3:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > 1.
> > > + if (errarg->rel)
> > > + appendStringInfo(&buf, _(" for replication target relation \"%s.%s\""),
> > > + errarg->rel->remoterel.nspname,
> > > + errarg->rel->remoterel.relname);
> > > +
> > > + if (errarg->remote_attnum >= 0)
> > > + appendStringInfo(&buf, _(" column \"%s\""),
> > > + errarg->rel->remoterel.attnames[errarg->remote_attnum]);
> > >
> > > Isn't it better if 'remote_attnum' check is inside if (errarg->rel)
> > > check? It will be weird to print column information without rel
> > > information and in the current code, we don't set remote_attnum
> > > without rel. The other possibility could be to have an Assert for rel
> > > in 'remote_attnum' check.
> >
> > Agreed to check 'remote_attnum' inside "if (errarg->rel)".
> >
>
> Okay, changed accordingly. Additionally, I have changed the code which
> sets timestamp to (unset) when it is 0 so that it won't display the
> timestamp in that case. I have made few other cosmetic changes in the
> attached patch. See and let me know what you think of it?

Thank you for the patch!

Agreed with these changes. The patch looks good to me.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

August 27, 2021, 07:36:48

On Thu, Aug 26, 2021 at 6:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Aug 26, 2021 at 9:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > Okay, changed accordingly. Additionally, I have changed the code which
> > sets timestamp to (unset) when it is 0 so that it won't display the
> > timestamp in that case. I have made few other cosmetic changes in the
> > attached patch. See and let me know what you think of it?
>
> Thank you for the patch!
>
> Agreed with these changes. The patch looks good to me.
>

Pushed, feel free to rebase and send the remaining patch set.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

August 27, 2021, 14:03:05

On Fri, Aug 27, 2021 at 1:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Aug 26, 2021 at 6:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Aug 26, 2021 at 9:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > Okay, changed accordingly. Additionally, I have changed the code which
> > > sets timestamp to (unset) when it is 0 so that it won't display the
> > > timestamp in that case. I have made few other cosmetic changes in the
> > > attached patch. See and let me know what you think of it?
> >
> > Thank you for the patch!
> >
> > Agreed with these changes. The patch looks good to me.
> >
>
> Pushed, feel free to rebase and send the remaining patch set.

Thanks!

I'll post the updated version patch on Monday.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

August 30, 2021, 10:06:55

On Fri, Aug 27, 2021 at 8:03 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Aug 27, 2021 at 1:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Aug 26, 2021 at 6:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Aug 26, 2021 at 9:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > >
> > > > Okay, changed accordingly. Additionally, I have changed the code which
> > > > sets timestamp to (unset) when it is 0 so that it won't display the
> > > > timestamp in that case. I have made few other cosmetic changes in the
> > > > attached patch. See and let me know what you think of it?
> > >
> > > Thank you for the patch!
> > >
> > > Agreed with these changes. The patch looks good to me.
> > >
> >
> > Pushed, feel free to rebase and send the remaining patch set.
>
> Thanks!
>
> I'll post the updated version patch on Monday.

I've attached rebased patches. The 0004 patch is outside the scope of this
patch set. It's borrowed from another thread[1] to fix the assertion
failure in the newly added tests. Please review them.

Regards,

[1] https://www.postgresql.org/message-id/CAFiTN-v-zFpmm7Ze1Dai5LJjhhNYXvK8QABs35X89WY0HDG4Ww%40mail.gmail.com

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

On Mon, Aug 30, 2021 3:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> I've attached rebased patches. 0004 patch is not the scope of this 
> patch. It's borrowed from another thread[1] to fix the assertion 
> failure for newly added tests. Please review them.

Hi,

I reviewed the 0002 patch and have a suggestion for it.

+                if (IsSet(opts.specified_opts, SUBOPT_SYNCHRONOUS_COMMIT))
+                {
+                    values[Anum_pg_subscription_subsynccommit - 1] =
+                        CStringGetTextDatum("off");
+                    replaces[Anum_pg_subscription_subsynccommit - 1] = true;
+                }

Currently, the patch sets the default value outside of parse_subscription_options(),
but I think it might be more standard to set the value inside
parse_subscription_options(), like:

            if (!is_reset)
            {
                ...
+            }
+            else
+                opts->synchronous_commit = "off";

And then, we can set the value like:

                    values[Anum_pg_subscription_subsynccommit - 1] =
                        CStringGetTextDatum(opts.synchronous_commit);

Besides, instead of adding a switch case like the following:
+        case ALTER_SUBSCRIPTION_RESET_OPTIONS:
+            {

We can add a bool flag (isReset) to AlterSubscriptionStmt and check the flag
when invoking parse_subscription_options(). With this approach, the code can be
shorter.

I've attached a diff file based on v12-0002 that changes the code as
suggested above.
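
To illustrate the shape of the suggestion without the PostgreSQL tree, here is a minimal standalone sketch. It is only an approximation under stated assumptions: the SubOpts and OptDef structs, the bit values, and the simplified signature are hypothetical, and only the names parse_subscription_options, is_reset, synchronous_commit and the "off" default come from the snippets above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical bit values; the real SUBOPT_* flags differ. */
#define SUBOPT_SYNCHRONOUS_COMMIT 0x01
#define SUBOPT_BINARY             0x02

typedef struct
{
    unsigned    specified_opts;
    const char *synchronous_commit;
    bool        binary;
} SubOpts;

/* One "name [= value]" item; value is ignored when resetting. */
typedef struct
{
    const char *name;
    const char *value;
} OptDef;

/*
 * Parse the option list.  With is_reset set, each named option receives its
 * default value here, so the caller can store opts unconditionally and
 * needs no separate code path for RESET.
 */
static bool
parse_subscription_options(const OptDef *defs, int ndefs, bool is_reset,
                           SubOpts *opts)
{
    *opts = (SubOpts) {0};

    for (int i = 0; i < ndefs; i++)
    {
        if (!is_reset && defs[i].value == NULL)
            return false;       /* SET requires a value */

        if (strcmp(defs[i].name, "synchronous_commit") == 0)
        {
            opts->specified_opts |= SUBOPT_SYNCHRONOUS_COMMIT;
            opts->synchronous_commit = is_reset ? "off" : defs[i].value;
        }
        else if (strcmp(defs[i].name, "binary") == 0)
        {
            opts->specified_opts |= SUBOPT_BINARY;
            opts->binary = is_reset ? false
                : (strcmp(defs[i].value, "true") == 0);
        }
        else
            return false;       /* unrecognized option */
    }
    return true;
}
```

The caller then writes opts.synchronous_commit into the catalog tuple the same way for SET and RESET, which is where the code gets shorter.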

Best regards,
Hou zj

Attachments

0001-diff-for-0002_patch

Re: Skipping logical replication transactions on subscriber side

From

Greg Nancarrow

Date:

September 2, 2021, 15:03:36

On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached rebased patches. 0004 patch is not the scope of this
> patch. It's borrowed from another thread[1] to fix the assertion
> failure for newly added tests. Please review them.
>

Some initial comments for the v12-0003 patch:

(1) Patch comment
"This commit introduces another way to skip the transaction in question."

I think it should further explain: "This commit introduces another way
to skip the transaction in question, other than manually updating the
subscriber's database or using pg_replication_origin_advance()."


doc/src/sgml/logical-replication.sgml
(2)

Suggested minor update:

BEFORE:
+   transaction that conflicts with the existing data.  When a conflict produce
+   an error, it is shown in <structname>pg_stat_subscription_errors</structname>
+   view as follows:
AFTER:
+   transaction that conflicts with the existing data.  When a conflict produces
+   an error, it is recorded in the <structname>pg_stat_subscription_errors</structname>
+   view as follows:

(3)
+   found from those outputs (transaction ID 740 in the above case). The transaction

Shouldn't it be transaction ID 716?

(4)
+   can be skipped by setting <replaceable>skip_xid</replaceable> to the subscription

Is it better to say here ... "on the subscription" ?

(5)
Just skipping a transaction could make a subscriber inconsistent, right?

Would it be better as follows?

BEFORE:
+   In either way, those should be used as a last resort. They skip the whole
+   transaction including changes that may not violate any constraint and easily
+   make subscriber inconsistent if a user specifies the wrong transaction ID or
+   the position of origin.

AFTER:
+   Either way, those transaction skipping methods should be used as a last resort.
+   They skip the whole transaction, including changes that may not violate any
+   constraint.  They may easily make the subscriber inconsistent, especially if a
+   user specifies the wrong transaction ID or the position of origin.

(6)
The grammar is not great in the following description, so here's a
suggested improvement:

BEFORE:
+          incoming change or by skipping the whole transaction.  This option
+          specifies transaction ID that logical replication worker skips to
+          apply.  The logical replication worker skips all data modification

AFTER:
+          incoming changes or by skipping the whole transaction.  This option
+          specifies the ID of the transaction whose application is to be skipped
+          by the logical replication worker.  The logical replication worker
+          skips all data modification


src/backend/postmaster/pgstat.c
(7)
BEFORE:
+ * Tell the collector about clear the error of subscription.
AFTER:
+ * Tell the collector to clear the subscription error.


src/backend/replication/logical/worker.c
(8)
+ * subscription is invalidated and* MySubscription->skipxid gets changed or reset.

There is a "*" after "and".

(9)
Do these lines really need to be moved up?

+ /* extract XID of the top-level transaction */
+ stream_xid = logicalrep_read_stream_start(s, &first_segment);
+

src/backend/postmaster/pgstat.c
(10)

+ bool m_clear; /* clear all fields except for last_failure and
+ * last_errmsg */

I think it should say: clear all fields except for last_failure,
last_errmsg and stat_reset_timestamp.


Regards,
Greg Nancarrow
Fujitsu Australia

Re: Skipping logical replication transactions on subscriber side

From

Mark Dilger

Date:

September 2, 2021, 22:33:52

> On Aug 30, 2021, at 12:06 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached rebased patches.

Thanks for these patches, Sawada-san!

The first patch in your series, v12-0001, seems useful to me even before committing any of the rest.  I would like to
integrate the new pg_stat_subscription_errors view it creates into regression tests for other logical replication
features under development.

In particular, it can be hard to write TAP tests that need to wait for subscriptions to catch up or fail.  With your
view committed, a new PostgresNode function to wait for catchup or for failure can be added, and then developers of
different projects can all use that.  I am attaching a version of such a function, plus some tests of your patch (since
it does not appear to have any).  Would you mind reviewing these and giving comments or including them in your next
patch version?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments

0001-Adding-tests-of-subscription-errors.patch

Re: Skipping logical replication transactions on subscriber side

From

Mark Dilger

Date:

September 2, 2021, 23:44:58

> On Aug 30, 2021, at 12:06 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached rebased patches.

Here are some review comments:

For the v12-0002 patch:

The documentation changes for ALTER SUBSCRIPTION .. RESET look strange to me. You grouped SET and RESET together, much
like sql-altertable.html has them grouped, but I don't think it flows naturally here, as the two commands do not support
the same set of parameters. It might look better if you documented these separately. It might also be good to order
the parameters the same, so that the differences can more quickly be seen.

For the v12-0003 patch:

I believe this feature is needed, but it also seems like a very powerful foot-gun. Can we do anything to make it less
likely that users will hurt themselves with this tool?

I am thinking back to support calls I have attended. When a production system is down, there is often some hesitancy
to perform ad-hoc operations on the database, but once the decision has been made to do so, people try to get the whole
process done as quickly as possible. If multiple transactions on the publisher fail on the subscriber, they will do so
in series, not in parallel. The process of clearing these errors will amount to copying the xid of each failed
transaction to the ALTER SUBSCRIPTION ... SET (skip_xid = xxx) command and running it, then the next, then the next,
.... Perhaps the first couple times through the process, the customer will look to see that the failure is of the same
type and on the same table, but after a short time they will likely just script something to clear the rest as quickly
as possible. In the heat of the moment, they may not include a check of the failure message, but merely a grep of the
failing xid.

If the user could instead clear all failed transactions of the same type, that might make it less likely that they
unthinkingly also skip subsequent errors of some different type. Perhaps something like ALTER SUBSCRIPTION ... SET
(skip_failures = 'duplicate key value violates unique constraint "test_pkey"')? This is arguably a different feature
request, and not something your patch is required to address, but I wonder how much we should limit people shooting
themselves in the foot? If we built something like this using your skip_xid feature, rather than instead of your
skip_xid feature, would your feature need to be modified?

The docs could use some rewording, too:

+ If incoming data violates any constraints the logical replication
+ will stop until it is resolved.

In my experience, logical replication doesn't stop, but instead goes into an infinite loop of retries.

+ The resolution can be done either
+ by changing data on the subscriber so that it doesn't conflict with
+ incoming change or by skipping the whole transaction.

I'm having trouble thinking of an example conflict where skipping a transaction would be better than writing a BEFORE
INSERT trigger on the conflicting table which suppresses or redirects conflicting rows somewhere else. Particularly for
larger transactions containing multiple statements, suppressing the conflicting rows using a trigger would be less messy
than skipping the transaction. I think your patch adds a useful tool to the toolkit, but maybe we should mention more
alternatives in the docs? Something like, "changing the data on the subscriber so that it doesn't conflict with
incoming changes, or dropping the conflicting constraint or unique index, or writing a trigger on the subscriber to
suppress or redirect conflicting incoming changes, or as a last resort, by skipping the whole transaction"?

Perhaps I'm reading your phrase "changing the data on the subscriber" too narrowly. To me, that means running DML
(either a DELETE or an UPDATE) on the existing data in the table where the conflict arises. These other options are DDL
and do not easily come to mind when I read that phrase.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Skipping logical replication transactions on subscriber side

From

Greg Nancarrow

Date:

September 3, 2021, 09:46:07

On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached rebased patches. 0004 patch is not the scope of this
> patch. It's borrowed from another thread[1] to fix the assertion
> failure for newly added tests. Please review them.
>

BTW, these patches need rebasing (broken by recent commits, patches
0001, 0003 and 0004 no longer apply, and it's failing in the cfbot).

Regards,
Greg Nancarrow
Fujitsu Australia

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

September 4, 2021, 06:24:32

On Fri, Sep 3, 2021 at 2:15 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
>
> > On Aug 30, 2021, at 12:06 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached rebased patches.
> For the v12-0003 patch:
>
> I believe this feature is needed, but it also seems like a very powerful foot-gun.  Can we do anything to make it
> less likely that users will hurt themselves with this tool?
>

This won't do any more harm than what users can currently do via
pg_replication_slot_advance, which is documented as well; see
[1]. This will be allowed only for superusers. Its effect will be
documented with a precautionary note to use it only when the other
available ways can't be used. Any better ideas?

> I am thinking back to support calls I have attended.  When a production system is down, there is often some hesitancy
> to perform ad-hoc operations on the database, but once the decision has been made to do so, people try to get the whole
> process done as quickly as possible.  If multiple transactions on the publisher fail on the subscriber, they will do so
> in series, not in parallel.
>

The subscriber knows about only one transaction failure at a time. Until
that is resolved, the apply won't move ahead, and the subscriber won't know
whether other transactions are going to fail in the future.

>
> If the user could instead clear all failed transactions of the same type, that might make it less likely that they
> unthinkingly also skip subsequent errors of some different type.  Perhaps something like ALTER SUBSCRIPTION ... SET
> (skip_failures = 'duplicate key value violates unique constraint "test_pkey"')?
>

I think that, if we want, we can allow skipping a particular error via
something like skip_error_code instead of via the error message, but I'm
not sure it would be better to skip a particular operation of the
transaction rather than the entire transaction. Normally, for atomicity, a
transaction is either committed or rolled back, never partially
done, so I think it would be preferable to skip the entire transaction
rather than skipping it partially.
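
The behaviour described in the last two replies can be modelled with a small standalone sketch. It is a toy model, not the patch: RemoteTxn, Subscriber and apply_until_error() are hypothetical names, and clearing skip_xid after one use is an assumption made for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Xid;

/* Hypothetical model of one remote transaction in commit order. */
typedef struct
{
    Xid  xid;
    int  nchanges;
    bool conflicts;     /* would applying it violate a constraint? */
} RemoteTxn;

typedef struct
{
    Xid skip_xid;       /* 0 means nothing is marked to be skipped */
    int applied_changes;
    int skipped_changes;
} Subscriber;

/*
 * Apply transactions starting at *pos.  Returns the xid of the first failing
 * transaction, or 0 once everything has been applied.  On failure *pos is
 * left unchanged, so the next call retries the same transaction and nothing
 * later in the stream is examined.
 */
static Xid
apply_until_error(Subscriber *sub, const RemoteTxn *txns, size_t ntxns,
                  size_t *pos)
{
    while (*pos < ntxns)
    {
        const RemoteTxn *txn = &txns[*pos];

        if (sub->skip_xid != 0 && txn->xid == sub->skip_xid)
        {
            /* Skip all changes of the transaction, never a part of it. */
            sub->skipped_changes += txn->nchanges;
            sub->skip_xid = 0;  /* assumption: the setting is one-shot */
        }
        else if (txn->conflicts)
            return txn->xid;
        else
            sub->applied_changes += txn->nchanges;

        (*pos)++;
    }
    return 0;
}
```

In this model a second conflicting transaction later in the stream only becomes visible after the first one has been resolved or skipped, which matches the point that failures surface one at a time.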

>  This is arguably a different feature request, and not something your patch is required to address, but I wonder how
> much we should limit people shooting themselves in the foot?  If we built something like this using your skip_xid
> feature, rather than instead of your skip_xid feature, would your feature need to be modified?
>

Sawada-San can answer better but I don't see any problem building any
such feature on top of what is currently proposed.

>
> I'm having trouble thinking of an example conflict where skipping a transaction would be better than writing a BEFORE
> INSERT trigger on the conflicting table which suppresses or redirects conflicting rows somewhere else.  Particularly for
> larger transactions containing multiple statements, suppressing the conflicting rows using a trigger would be less messy
> than skipping the transaction.  I think your patch adds a useful tool to the toolkit, but maybe we should mention more
> alternatives in the docs?  Something like, "changing the data on the subscriber so that it doesn't conflict with
> incoming changes, or dropping the conflicting constraint or unique index, or writing a trigger on the subscriber to
> suppress or redirect conflicting incoming changes, or as a last resort, by skipping the whole transaction"?
>

+1 for extending the docs as per this suggestion.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

September 4, 2021, 06:38:15

On Sat, Sep 4, 2021 at 8:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Sep 3, 2021 at 2:15 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
> >
> > > On Aug 30, 2021, at 12:06 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > I've attached rebased patches.
> > For the v12-0003 patch:
> >
> > I believe this feature is needed, but it also seems like a very powerful foot-gun.  Can we do anything to make it
> > less likely that users will hurt themselves with this tool?
> >
>
> This won't do any more harm than currently, users can do via
> pg_replication_slot_advance and the same is documented as well, see
> [1].
>

Sorry, forgot to give the link.

[1] - https://www.postgresql.org/docs/devel/logical-replication-conflicts.html

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

September 5, 2021, 16:41:20

On Thu, Sep 2, 2021 at 12:06 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> >
> > I've attached rebased patches. 0004 patch is not the scope of this
> > patch. It's borrowed from another thread[1] to fix the assertion
> > failure for newly added tests. Please review them.
> >
>
> I have some initial feedback on the v12-0001 patch.
> Most of these are suggested improvements to wording, and some typo fixes.

Thank you for the comments!

>
>
> (0) Patch comment
>
> Suggestion to improve the patch comment:
>
> BEFORE:
> Add pg_stat_subscription_errors statistics view.
>
> This commits adds new system view pg_stat_logical_replication_error,

Oops, I realized that it should be pg_stat_subscription_errors.

> showing errors happening during applying logical replication changes
> as well as during performing initial table synchronization.
>
> The subscription error entries are removed by autovacuum workers when
> the table synchronization competed in table sync worker cases and when
> dropping the subscription in apply worker cases.
>
> It also adds SQL function pg_stat_reset_subscription_error() to
> reset the single subscription error.
>
> AFTER:
> Add a subscription errors statistics view "pg_stat_subscription_errors".
>
> This commits adds a new system view pg_stat_logical_replication_errors,
> that records information about any errors which occur during application
> of logical replication changes as well as during performing initial table
> synchronization.

I think that views don't have any data so "show information" seems
appropriate to me here. Thoughts?

>
> The subscription error entries are removed by autovacuum workers after
> table synchronization completes in table sync worker cases and after
> dropping the subscription in apply worker cases.
>
> It also adds an SQL function pg_stat_reset_subscription_error() to
> reset a single subscription error.
>
>
>
> doc/src/sgml/monitoring.sgml:
>
> (1)
> BEFORE:
> +      <entry>One row per error that happened on subscription, showing information about
> +       the subscription errors.
> AFTER:
> +      <entry>One row per error that occurred on subscription, providing information about
> +       each subscription error.

Fixed.

>
> (2)
> BEFORE:
> +   The <structname>pg_stat_subscription_errors</structname> view will contain one
> AFTER:
> +   The <structname>pg_stat_subscription_errors</structname> view contains one
>

I think that the descriptions of other statistics views also say "XXX view
will contain ...".

>
> (3)
> BEFORE:
> +        Name of the database in which the subscription is created.
> AFTER:
> +        Name of the database in which the subscription was created.

Fixed.

>
> (4)
> BEFORE:
> +       OID of the relation that the worker is processing when the
> +       error happened.
> AFTER:
> +       OID of the relation that the worker was processing when the
> +       error occurred.
>

Fixed.

>
> (5)
> BEFORE:
> +        Name of command being applied when the error happened.  This
> +        field is always NULL if the error is reported by
> +        <literal>tablesync</literal> worker.
> AFTER:
> +        Name of command being applied when the error occurred.  This
> +        field is always NULL if the error is reported by a
> +        <literal>tablesync</literal> worker.

Fixed.

> (6)
> BEFORE:
> +        Transaction ID of publisher node being applied when the error
> +        happened.  This field is always NULL if the error is reported
> +        by <literal>tablesync</literal> worker.
> AFTER:
> +        Transaction ID of the publisher node being applied when the error
> +        happened.  This field is always NULL if the error is reported
> +        by a <literal>tablesync</literal> worker.

Fixed.

> (7)
> BEFORE:
> +        Type of worker reported the error: <literal>apply</literal> or
> +        <literal>tablesync</literal>.
> AFTER:
> +        Type of worker reporting the error: <literal>apply</literal> or
> +        <literal>tablesync</literal>.

Fixed.

>
> (8)
> BEFORE:
> +       Number of times error happened on the worker.
> AFTER:
> +       Number of times the error occurred in the worker.
>
> [or "Number of times the worker reported the error" ?]

I prefer "Number of times the error occurred in the worker."

>
> (9)
> BEFORE:
> +       Time at which the last error happened.
> AFTER:
> +       Time at which the last error occurred.

Fixed.

>
> (10)
> BEFORE:
> +       Error message which is reported last failure time.
> AFTER:
> +       Error message which was reported at the last failure time.
>
> Maybe this should just say "Last reported error message" ?

Fixed.

>
>
> (11)
> You shouldn't call hash_get_num_entries() on a NULL pointer.
>
> Suggest swapping lines, as shown below:
>
> BEFORE:
> + int32 nerrors = hash_get_num_entries(subent->suberrors);
> +
> + /* Skip this subscription if not have any errors */
> + if (subent->suberrors == NULL)
> +    continue;
> AFTER:
> + int32 nerrors;
> +
> + /* Skip this subscription if not have any errors */
> + if (subent->suberrors == NULL)
> +    continue;
> + nerrors = hash_get_num_entries(subent->suberrors);

Right. Fixed.

>
>
> (12)
> Typo:  legnth -> length
>
> + * contains the fixed-legnth error message string which is

Fixed.

>
>
> src/backend/postmaster/pgstat.c
>
> (13)
> "Subscription stat entries" hashtable is created in two different
> places, one with HASH_CONTEXT and the other without. Is this
> intentional?
> Shouldn't there be a single function for creating this?

Yes, it's intentional. It's consistent with hash tables for other statistics.

>
>
> (14)
> + * PgStat_MsgSubscriptionPurge Sent by the autovacuum purge the subscriptions.
>
> Seems to be missing a word, is it meant to say "Sent by the autovacuum
> to purge the subscriptions." ?

Yes, fixed.

>
> (15)
> + * PgStat_MsgSubscriptionErrPurge Sent by the autovacuum purge the subscription
> + * errors.
>
> Seems to be missing a word, is it meant to say "Sent by the autovacuum
> to purge the subscription errors." ?

Thanks, fixed.

Regards,


--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

September 5, 2021, 16:41:43

On Thu, Sep 2, 2021 at 2:55 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> >
> > I've attached rebased patches. 0004 patch is not the scope of this
> > patch. It's borrowed from another thread[1] to fix the assertion
> > failure for newly added tests. Please review them.
> >
>
> I have a few comments on the v12-0002 patch:

Thank you for the comments!

>
> (1) Patch comment
>
> Has a typo and could be expressed a bit better.
>
> Suggestion:
>
> BEFORE:
> RESET command is reuiqred by follow-up commit introducing to a new
> parameter skip_xid to reset.
> AFTER:
> The RESET parameter for ALTER SUBSCRIPTION is required by the
> follow-up commit that introduces a new resettable subscription
> parameter "skip_xid".

Fixed.

>
>
> doc/src/sgml/ref/alter_subscription.sgml
>
> (2)
> I don't think "RESET" is sufficiently described in
> alter_subscription.sgml. Just putting it under "SET" and changing
> "altered" to "set" doesn't explain what resetting does. It should say
> something about setting the parameter back to its original (default)
> value.

Doesn't "RESET" normally mean to change the parameter back to its default value?

>
>
> (3)
> case ALTER_SUBSCRIPTION_RESET_OPTIONS
>
> Some comments here would be helpful e.g. Reset the specified
> parameters back to their default values.

Okay, added.

Regards,


--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

September 5, 2021, 16:42:25

On Thu, Sep 2, 2021 at 9:03 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached rebased patches. 0004 patch is not the scope of this
> > patch. It's borrowed from another thread[1] to fix the assertion
> > failure for newly added tests. Please review them.
> >
>

Thank you for the comments!

> Some initial comments for the v12-0003 patch:
>
> (1) Patch comment
> "This commit introduces another way to skip the transaction in question."
>
> I think it should further explain: "This commit introduces another way
> to skip the transaction in question, other than manually updating the
> subscriber's database or using pg_replication_origin_advance()."

Updated.

>
>
> doc/src/sgml/logical-replication.sgml
> (2)
>
> Suggested minor update:
>
> BEFORE:
> +   transaction that conflicts with the existing data.  When a conflict produce
> +   an error, it is shown in
> <structname>pg_stat_subscription_errors</structname>
> +   view as follows:
> AFTER:
> +   transaction that conflicts with the existing data.  When a conflict produces
> +   an error, it is recorded in the
> <structname>pg_stat_subscription_errors</structname>
> +   view as follows:

Fixed.

>
> (3)
> +   found from those outputs (transaction ID 740 in the above case).
> The transaction
>
> Shouldn't it be transaction ID 716?

Right, fixed.

>
> (4)
> +   can be skipped by setting <replaceable>skip_xid</replaceable> to
> the subscription
>
> Is it better to say here ... "on the subscription" ?

Okay, fixed.

>
> (5)
> Just skipping a transaction could make a subscriber inconsistent, right?
>
> Would it be better as follows?
>
> BEFORE:
> +   In either way, those should be used as a last resort. They skip the whole
> +   transaction including changes that may not violate any constraint and easily
> +   make subscriber inconsistent if a user specifies the wrong transaction ID or
> +   the position of origin.
>
> AFTER:
> +   Either way, those transaction skipping methods should be used as a
> last resort.
> +   They skip the whole transaction, including changes that may not violate any
> +   constraint.  They may easily make the subscriber inconsistent,
> especially if a
> +   user specifies the wrong transaction ID or the position of origin.

Agreed, fixed.

>
> (6)
> The grammar is not great in the following description, so here's a
> suggested improvement:
>
> BEFORE:
> +          incoming change or by skipping the whole transaction.  This option
> +          specifies transaction ID that logical replication worker skips to
> +          apply.  The logical replication worker skips all data modification
>
> AFTER:
> +          incoming changes or by skipping the whole transaction.  This option
> +          specifies the ID of the transaction whose application is to
> be skipped
> +          by the logical replication worker.  The logical replication worker
> +          skips all data modification

Fixed.

>
>
> src/backend/postmaster/pgstat.c
> (7)
> BEFORE:
> + * Tell the collector about clear the error of subscription.
> AFTER:
> + * Tell the collector to clear the subscription error.

Fixed.

>
>
> src/backend/replication/logical/worker.c
> (8)
> + * subscription is invalidated and* MySubscription->skipxid gets
> changed or reset.
>
> There is a "*" after "and".

Fixed.

>
> (9)
> Do these lines really need to be moved up?
>
> + /* extract XID of the top-level transaction */
> + stream_xid = logicalrep_read_stream_start(s, &first_segment);
> +

I had forgotten to revert this change; fixed.

>
> src/backend/postmaster/pgstat.c
> (10)
>
> + bool m_clear; /* clear all fields except for last_failure and
> + * last_errmsg */
>
> I think it should say: clear all fields except for last_failure,
> last_errmsg and stat_reset_timestamp.

Fixed.

Those comments, including your comments on v12-0001 and v12-0002,
are incorporated into my local branch. I'll submit the updated patches
after incorporating all other comments.

Regards,


--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

September 5, 2021, 16:57:28

On Thu, Sep 2, 2021 at 5:41 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> From Mon, Aug 30, 2021 3:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > I've attached rebased patches. 0004 patch is not the scope of this patch. It's
> > borrowed from another thread[1] to fix the assertion failure for newly added
> > tests. Please review them.
>
> Hi,
>
> I reviewed the v12-0001 patch, here are some comments:

Thank you for the comments!

>
> 1)
> --- a/src/backend/utils/error/elog.c
> +++ b/src/backend/utils/error/elog.c
> @@ -1441,7 +1441,6 @@ getinternalerrposition(void)
>         return edata->internalpos;
>  }
>
> -
>
> This seems to be an unintended change in elog.c

Fixed.

>
> 2)
>
> +       TupleDescInitEntry(tupdesc, (AttrNumber) 10, "stats_reset",
> +                                          TIMESTAMPTZOID, -1, 0);
>
> The document doesn't mention the column "stats_reset".

Added.

> 3)
>
> +typedef struct PgStat_StatSubErrEntry
> +{
> +       Oid                     subrelid;               /* InvalidOid if the apply worker, otherwise
> +                                                                * the table sync worker. hash table key. */
>
> From the comments of subrelid, I think one subscription only has one apply
> worker error entry, right? If so, I was wondering whether we can move the apply
> error entry to PgStat_StatSubEntry. With that approach, we don't need to build an
> inner hash table when there is no table sync error entry.

I wanted to avoid having unnecessary error entry fields when there is
no apply worker error but there is a table sync worker error. But
after more thought, the apply worker is more likely to raise an error
than table sync workers. So it might be better to have both
PgStat_StatSubErrEntry for the apply worker error and a hash table for
table sync worker errors in PgStat_StatSubEntry.

>
> 4)
> Is it possible to add some test cases to test deletion of the subscription error entry?

Do you mean the tests checking if subscription error entry is deleted
after DROP SUBSCRIPTION?

Those comments are incorporated into local branches. I'll submit the
updated patches after incorporating other comments.

Regards,


--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

September 5, 2021, 16:57:58

On Thu, Sep 2, 2021 at 8:37 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> From Mon, Aug 30, 2021 3:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > I've attached rebased patches. 0004 patch is not the scope of this
> > patch. It's borrowed from another thread[1] to fix the assertion
> > failure for newly added tests. Please review them.
>
> Hi,
>
> I reviewed the 0002 patch and have a suggestion for it.
>
> +                               if (IsSet(opts.specified_opts, SUBOPT_SYNCHRONOUS_COMMIT))
> +                               {
> +                                       values[Anum_pg_subscription_subsynccommit - 1] =
> +                                               CStringGetTextDatum("off");
> +                                       replaces[Anum_pg_subscription_subsynccommit - 1] = true;
> +                               }
>
> Currently, the patch sets the default value outside of parse_subscription_options(),
> but I think it might be more standard to set the value in
> parse_subscription_options(). Like:
>
>                         if (!is_reset)
>                         {
>                                 ...
> +                       }
> +                       else
> +                               opts->synchronous_commit = "off";
>
> And then, we can set the value like:
>
>                                         values[Anum_pg_subscription_subsynccommit - 1] =
>                                                 CStringGetTextDatum(opts.synchronous_commit);

You're right. Fixed.

>
>
> Besides, instead of adding a switch case like the following:
> +               case ALTER_SUBSCRIPTION_RESET_OPTIONS:
> +                       {
>
> We can add a bool flag (isReset) in AlterSubscriptionStmt and check the flag
> when invoking parse_subscription_options(). With this approach, the code can be
> shorter.
>
> Attached is a diff file based on v12-0002 which changes the code as
> suggested above.

Thank you for the patch!

@@ -3672,11 +3671,12 @@ typedef enum AlterSubscriptionType
 typedef struct AlterSubscriptionStmt
 {
        NodeTag         type;
-       AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_SET_OPTIONS, etc */
+       AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_OPTIONS, etc */
        char       *subname;            /* Name of the subscription */
        char       *conninfo;           /* Connection string to publisher */
        List       *publication;        /* One or more publication to subscribe to */
        List       *options;            /* List of DefElem nodes */
+       bool            isReset;                /* true if RESET option */
 } AlterSubscriptionStmt;

It seems unnatural to me that AlterSubscriptionStmt has an isReset flag
in spite of having 'kind' indicating the command. How about having the
RESET command use the same logic as SET, as you suggested, while keeping
both ALTER_SUBSCRIPTION_SET_OPTIONS and
ALTER_SUBSCRIPTION_RESET_OPTIONS?
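
To make the idea concrete, here is a rough standalone sketch, not the patch code. All `Demo*`/`demo_*` names are hypothetical; the real code works on AlterSubscriptionStmt and parse_subscription_options(). It assumes, as in the quoted hunk, that the default for synchronous_commit is "off". SET and RESET stay separate command kinds, share one code path, and the default is filled in by the parser itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down command kinds; names mirror the thread. */
typedef enum DemoAlterKind
{
    DEMO_ALTER_SET_OPTIONS,
    DEMO_ALTER_RESET_OPTIONS
} DemoAlterKind;

typedef struct DemoSubOpts
{
    const char *synchronous_commit;
} DemoSubOpts;

/*
 * Parse one option.  For RESET the user supplies no value, so the parser
 * itself fills in the default ("off", as in the quoted hunk).
 */
static void
demo_parse_options(const char *value, bool is_reset, DemoSubOpts *opts)
{
    if (!is_reset)
        opts->synchronous_commit = value;
    else
        opts->synchronous_commit = "off";
}

/* SET and RESET share one code path; only the flag derived from kind differs. */
static const char *
demo_alter_subscription(DemoAlterKind kind, const char *value)
{
    DemoSubOpts opts = {NULL};

    switch (kind)
    {
        case DEMO_ALTER_SET_OPTIONS:
        case DEMO_ALTER_RESET_OPTIONS:
            demo_parse_options(value, kind == DEMO_ALTER_RESET_OPTIONS, &opts);
            break;
    }
    return opts.synchronous_commit;
}
```

The caller then stores whatever the parser returned, without needing a separate hard-coded default or an extra flag in the statement node.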

Regards,


--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

September 5, 2021, 16:58:32

On Fri, Sep 3, 2021 at 3:46 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached rebased patches. 0004 patch is not the scope of this
> > patch. It's borrowed from another thread[1] to fix the assertion
> > failure for newly added tests. Please review them.
> >
>
> BTW, these patches need rebasing (broken by recent commits, patches
> 0001, 0003 and 0004 no longer apply, and it's failing in the cfbot).

Thanks! I'll submit the updated patches early this week.

Regards,


--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

RE: Skipping logical replication transactions on subscriber side

From

"houzj.fnst@fujitsu.com"

Date:

September 6, 2021, 04:26:54

From Sun, Sep 5, 2021 9:58 PM Masahiko Sawada <sawada.mshk@gmail.com>:
> On Thu, Sep 2, 2021 at 8:37 PM houzj.fnst@fujitsu.com <houzj.fnst@fujitsu.com> wrote:
> >
> > From Mon, Aug 30, 2021 3:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > I've attached rebased patches. 0004 patch is not the scope of this
> > > patch. It's borrowed from another thread[1] to fix the assertion
> > > failure for newly added tests. Please review them.
> >
> > Hi,
> >
> > I reviewed the 0002 patch and have a suggestion for it.
> @@ -3672,11 +3671,12 @@ typedef enum AlterSubscriptionType
>  typedef struct AlterSubscriptionStmt
>  {
>         NodeTag         type;
> -       AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_SET_OPTIONS, etc */
> +       AlterSubscriptionType kind; /* ALTER_SUBSCRIPTION_OPTIONS, etc */
>         char       *subname;            /* Name of the subscription */
>         char       *conninfo;           /* Connection string to publisher */
>         List       *publication;        /* One or more publication to subscribe to */
>         List       *options;            /* List of DefElem nodes */
> +       bool            isReset;                /* true if RESET option */
>  } AlterSubscriptionStmt;
>
> It's unnatural to me that AlterSubscriptionStmt has isReset flag even in spite of
> having 'kind' indicating the command. How about having RESET command use
> the same logic of SET as you suggested while having both
> ALTER_SUBSCRIPTION_SET_OPTIONS and
> ALTER_SUBSCRIPTION_RESET_OPTIONS?

Yes, I agree with you it will look more natural with ALTER_SUBSCRIPTION_RESET_OPTIONS.

Best regards,
Hou zj

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

September 6, 2021, 08:49:49

On Sat, Sep 4, 2021 at 12:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Sep 3, 2021 at 2:15 AM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
> >
> > > On Aug 30, 2021, at 12:06 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > I've attached rebased patches.
> > For the v12-0003 patch:
> >
> > I believe this feature is needed, but it also seems like a very powerful
> > foot-gun.  Can we do anything to make it less likely that users will hurt
> > themselves with this tool?
> >
>
> This won't do any more harm than what users can currently do via
> pg_replication_slot_advance, and the same is documented as well, see
> [1]. This will be allowed only for superusers.  Its effect will be
> documented with a precautionary note to use it only when the other
> available ways can't be used.

Right.

>
> > I am thinking back to support calls I have attended.  When a production
> > system is down, there is often some hesitancy to perform ad-hoc operations
> > on the database, but once the decision has been made to do so, people try
> > to get the whole process done as quickly as possible.  If multiple
> > transactions on the publisher fail on the subscriber, they will do so in
> > series, not in parallel.
> >
>
> The subscriber will know only one transaction failure at a time, till
> that is resolved, the apply won't move ahead and it won't know even if
> there are other transactions that are going to fail in the future.
>
> >
> > If the user could instead clear all failed transactions of the same type,
> > that might make it less likely that they unthinkingly also skip subsequent
> > errors of some different type.  Perhaps something like ALTER SUBSCRIPTION
> > ... SET (skip_failures = 'duplicate key value violates unique constraint
> > "test_pkey"')?
> >
>
> I think, if we want, we can allow skipping a particular error via
> skip_error_code instead of via the error message, but I'm not sure it
> would be better to skip a particular operation of the transaction rather
> than the entire transaction. Normally, for atomicity, a transaction is
> either committed or rolled back, not partially done, so I think it would
> be preferable to skip the entire transaction rather than skipping it
> partially.

I think the suggestion by Mark is to skip the entire transaction if
the kind of error matches the specified error.

I think my proposed feature is meant to be a tool to cover situations
where something that should not have happened has happened, rather
than conflict resolution.  If users fall into a difficult situation
where they need to skip a lot of transactions with this skip_xid
feature, they should rebuild the logical replication from scratch. It
seems to me that skipping all transactions that failed due to the same
type of failure is problematic, for example, if the user forgets to
reset it. If we want to skip the particular operation that failed due
to the specified error, we should have a proper conflict resolution
feature that can handle various types of conflicts with various
resolution methods, as other RDBMSs support.

>
> >  This is arguably a different feature request, and not something your
> > patch is required to address, but I wonder how much we should limit people
> > shooting themselves in the foot?  If we built something like this using
> > your skip_xid feature, rather than instead of your skip_xid feature, would
> > your feature need to be modified?
> >
>
> Sawada-San can answer better but I don't see any problem building any
> such feature on top of what is currently proposed.

If the feature you proposed is to skip the entire transaction, I also
don't see any problem building the feature on top of my patch. The
patch adds the mechanism to skip the entire transaction so what we
need to do for that feature is to extend how to trigger the skipping
behavior.

>
> >
> > I'm having trouble thinking of an example conflict where skipping a
> > transaction would be better than writing a BEFORE INSERT trigger on the
> > conflicting table which suppresses or redirects conflicting rows somewhere
> > else.  Particularly for larger transactions containing multiple statements,
> > suppressing the conflicting rows using a trigger would be less messy than
> > skipping the transaction.  I think your patch adds a useful tool to the
> > toolkit, but maybe we should mention more alternatives in the docs?
> > Something like, "changing the data on the subscriber so that it doesn't
> > conflict with incoming changes, or dropping the conflicting constraint or
> > unique index, or writing a trigger on the subscriber to suppress or
> > redirect conflicting incoming changes, or as a last resort, by skipping the
> > whole transaction"?
> >
>
> +1 for extending the docs as per this suggestion.

Agreed. I'll add such a description to the docs.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

September 9, 2021, 17:32:54

On Sun, Sep 5, 2021 at 10:58 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Sep 3, 2021 at 3:46 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >
> > On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > I've attached rebased patches. 0004 patch is not the scope of this
> > > patch. It's borrowed from another thread[1] to fix the assertion
> > > failure for newly added tests. Please review them.
> > >
> >
> > BTW, these patches need rebasing (broken by recent commits, patches
> > 0001, 0003 and 0004 no longer apply, and it's failing in the cfbot).
>
> Thanks! I'll submit the updated patches early this week.
>

Sorry for the late response. I've attached the updated patches that
incorporate all comments unless I missed something. Please review
them.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Sorry for the late reply. I was on vacation.

On Tue, Sep 14, 2021 at 11:27 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> From Thur, Sep 9, 2021 10:33 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > Sorry for the late response. I've attached the updated patches that incorporate
> > all comments unless I missed something. Please review them.
>
> Thanks for the new version patches.
> Here are some comments for the v13-0001 patch.

Thank you for the comments!

>
> 1)
>
> +                                       pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
> +                                       pgstat_send(&errmsg, len);
> +                                       errmsg.m_nentries = 0;
> +                               }
>
> It seems we can invoke pgstat_setheader once before the loop like the
> following:
>
> +                       errmsg.m_nentries = 0;
> +                       errmsg.m_subid = subent->subid;
> +                       pgstat_setheader(&errmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONERRPURGE);
>
> 2)
> +                                       pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
> +                                       pgstat_send(&submsg, len);
>
> Same as 1), we can invoke pgstat_setheader once before the loop like:
> +               submsg.m_nentries = 0;
> +               pgstat_setheader(&submsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
>

But if we do that, we set the header even if there is no message to
send, right? Looking at other similar code in pgstat_vacuum_stat(), we
set the header just before sending the message. So I'd like to leave
them since it's cleaner.
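
As a rough illustration of that pattern (a standalone sketch, not the pgstat.c code): the message header is filled in only at the point where a batch is actually sent, so nothing is prepared when there is nothing to purge. `DemoPurgeMsg`, `demo_send()`, `demo_purge()` and `DEMO_BATCH_SIZE` are hypothetical names, and the stub only counts what would have been sent.

```c
#include <stddef.h>

#define DEMO_BATCH_SIZE 4       /* hypothetical per-message limit */

typedef struct DemoPurgeMsg
{
    int          m_type;                    /* header: set just before sending */
    int          m_nentries;
    unsigned int m_ids[DEMO_BATCH_SIZE];
} DemoPurgeMsg;

static int demo_sent_msgs = 0;  /* number of messages the stub has "sent" */
static int demo_sent_ids = 0;   /* number of ids carried by those messages */

/* Stub transport: records what would have been sent. */
static void
demo_send(const DemoPurgeMsg *msg)
{
    demo_sent_msgs++;
    demo_sent_ids += msg->m_nentries;
}

/* Send purge messages for all dead ids, at most DEMO_BATCH_SIZE per message. */
static void
demo_purge(const unsigned int *dead_ids, size_t ndead)
{
    DemoPurgeMsg msg;

    msg.m_nentries = 0;
    for (size_t i = 0; i < ndead; i++)
    {
        msg.m_ids[msg.m_nentries++] = dead_ids[i];

        /* Flush only when the batch is full. */
        if (msg.m_nentries == DEMO_BATCH_SIZE)
        {
            msg.m_type = 1;     /* header is filled in only when we send */
            demo_send(&msg);
            msg.m_nentries = 0;
        }
    }

    /* Flush the remainder, if any; with no dead ids no header is ever set. */
    if (msg.m_nentries > 0)
    {
        msg.m_type = 1;
        demo_send(&msg);
    }
}
```

Setting the header once before the loop would save a few assignments, at the cost of doing that work even when no message is sent.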

>
> 3)
>
> +/* ----------
> + * PgStat_MsgSubscriptionErrPurge      Sent by the autovacuum to purge the subscription
> + *                                                                     errors.
>
> The comment says it's sent by autovacuum; would a manual vacuum also send
> this message?

Right. Fixed.

>
>
> 4)
> +
> +       pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_reset) + sizeof(bool));
> +}
>
> Would it look cleaner to use the offset of m_relid here, like the following?
>
> pgstat_send(&msg, offsetof(PgStat_MsgSubscriptionErr, m_relid));

Thank you for the suggestion. After more thought, it was a bit odd to
use PgStat_MsgSubscriptionErr to both report and reset the stats by
sending either part of the struct or the full struct. So in the latest
version, I've added a new message struct type to reset the subscription
error statistics.
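
For readers unfamiliar with the partial-send idiom discussed in 4), here is a small standalone sketch. `DemoSubErrMsg` is a hypothetical layout, not the actual PgStat_MsgSubscriptionErr; only the field order matters. Both expressions give a length that covers every field up to and including m_reset; they can differ only by padding bytes between m_reset and m_relid.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical message layout; field names loosely follow the thread. */
typedef struct DemoSubErrMsg
{
    unsigned int m_subid;       /* subscription the message is about */
    bool         m_reset;       /* true: reset request, fields below unused */
    unsigned int m_relid;       /* first field needed only when reporting */
    char         m_message[256];
} DemoSubErrMsg;

/* Bytes up to and including m_reset, excluding any padding after it. */
static size_t
reset_len_by_sizeof(void)
{
    return offsetof(DemoSubErrMsg, m_reset) + sizeof(bool);
}

/* Bytes up to the start of m_relid, including any padding before it. */
static size_t
reset_len_by_next_field(void)
{
    return offsetof(DemoSubErrMsg, m_relid);
}
```

A dedicated reset message struct avoids having to reason about either length, which is the direction taken above.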

I've attached the updated version patches. Please review them.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

On Mon, Sep 27, 2021 at 2:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Sep 27, 2021 at 11:02 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Sep 27, 2021 at 12:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Mon, Sep 27, 2021 at 12:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Mon, Sep 27, 2021 at 6:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > On Sat, Sep 25, 2021 at 4:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > >
> > > > > > Sure, but each tablesync worker must have a separate relid. Why can't
> > > > > > we have a single hash table for both apply and table sync workers
> > > > > > which are hashed by sub_id + rel_id? For apply worker, the rel_id will
> > > > > > always be zero (InvalidOid) and tablesync workers will have a unique
> > > > > > OID for rel_id, so we should be able to uniquely identify each of
> > > > > > apply and table sync workers.
> > > > >
> > > > > What I imagined is to extend the subscription statistics, for
> > > > > instance, transaction stats[1]. By having a hash table for
> > > > > subscriptions, we can store those statistics into an entry of the hash
> > > > > table and we can think of subscription errors as also statistics of
> > > > > the subscription. So we can have another hash table for errors in an
> > > > > entry of the subscription hash table. For example, the subscription
> > > > > entry struct will be something like:
> > > > >
> > > > > typedef struct PgStat_StatSubEntry
> > > > > {
> > > > >     Oid subid; /* hash key */
> > > > >
> > > > >     HTAB *errors;    /* apply and table sync errors */
> > > > >
> > > > >     /* transaction stats of subscription */
> > > > >     PgStat_Counter xact_commit;
> > > > >     PgStat_Counter xact_commit_bytes;
> > > > >     PgStat_Counter xact_error;
> > > > >     PgStat_Counter xact_error_bytes;
> > > > >     PgStat_Counter xact_abort;
> > > > >     PgStat_Counter xact_abort_bytes;
> > > > >     PgStat_Counter failure_count;
> > > > > } PgStat_StatSubEntry;
> > > > >
> > > >
> > > > I think these additional stats will be displayed via
> > > > pg_stat_subscription, right? If so, the current stats of that view are
> > > > all in-memory and are per LogicalRepWorker which means that for those
> > > > stats also we will have different entries for apply and table sync
> > > > worker. If this understanding is correct, won't it be better to
> > > > represent this as below?
> > >
> > > I was thinking that we have a different stats view for example
> > > pg_stat_subscription_xacts that has entries per subscription. But your
> > > idea seems better to me.
> >
> > I mean that showing statistics (including transaction statistics and
> > errors) per logical replication worker seems better to me, no matter
> > what view shows these statistics. I'll change the patch in that way.
> >
>

I've attached updated patches that incorporate all comments I got so
far. Please review them.
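
As a sketch of the single-hash-table idea quoted above (one entry per worker, keyed by subscription OID plus relation OID), something like the following would be enough to tell workers apart. This is an illustration under assumptions, not the patch: `DemoOid`, `DemoSubWorkerKey` and the `demo_*` helpers are hypothetical names, and a real implementation would hand such a key struct to PostgreSQL's hash table code.

```c
#include <stdbool.h>

typedef unsigned int DemoOid;       /* stand-in for PostgreSQL's Oid */
#define DEMO_INVALID_OID ((DemoOid) 0)

/* Hypothetical composite key: one entry per (subscription, relation) pair. */
typedef struct DemoSubWorkerKey
{
    DemoOid subid;
    DemoOid relid;      /* DEMO_INVALID_OID for the apply worker */
} DemoSubWorkerKey;

/* Build a key; pass DEMO_INVALID_OID as relid for the apply worker. */
static DemoSubWorkerKey
demo_make_key(DemoOid subid, DemoOid relid)
{
    DemoSubWorkerKey key;

    key.subid = subid;
    key.relid = relid;
    return key;
}

/* Two keys name the same worker only if both OIDs match. */
static bool
demo_key_equal(DemoSubWorkerKey a, DemoSubWorkerKey b)
{
    return a.subid == b.subid && a.relid == b.relid;
}

/* The apply worker is the entry whose relid is invalid. */
static bool
demo_key_is_apply_worker(DemoSubWorkerKey key)
{
    return key.relid == DEMO_INVALID_OID;
}
```

Because each tablesync worker has its own relid and the apply worker always uses the invalid OID, no two workers of one subscription can collide on the same key.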

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

On Fri, Oct 8, 2021 at 8:17 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Thu, Sep 30, 2021 at 3:45 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached updated patches that incorporate all comments I got so
> > far. Please review them.
> >
>
> Some comments about the v15-0001 patch:

Thank you for the comments!

>
> (1) patch adds a whitespace error
>
> Applying: Add a subscription errors statistics view
> "pg_stat_subscription_errors".
> .git/rebase-apply/patch:1656: new blank line at EOF.
> +
> warning: 1 line adds whitespace errors.

Fixed.

>
> (2) Patch comment says "This commit adds a new system view
> pg_stat_logical_replication_errors ..."
> BUT this is the wrong name, it should be "pg_stat_subscription_errors".
>
>

Fixed.

> doc/src/sgml/monitoring.sgml
>
> (3)
> "Message of the error" doesn't sound right. I suggest just saying "The
> error message".

Fixed.

>
> (4) view column "last_failed_time"
> I think it would be better to name this "last_error_time".

Okay, fixed.

>
>
> src/backend/postmaster/pgstat.c
>
> (5) pgstat_vacuum_subworker_stats()
>
> Spelling mistake in the following comment:
>
> /* Create a map for mapping subscriptoin OID and database OID */
>
> subscriptoin -> subscription

Fixed.

>
> (6)
> In the following functions:
>
> pgstat_read_statsfiles
> pgstat_read_db_statsfile_timestamp
>
> The following comment should say "... struct describing subscription
> worker statistics."
> (i.e. need to remove the "a")
>
> + * 'S' A PgStat_StatSubWorkerEntry struct describing a
> + * subscription worker statistics.
>

Fixed.

>
> (7) pgstat_get_subworker_entry
>
> Suggest comment change:
>
> BEFORE:
> + * Return the entry of subscription worker entry with the subscription
> AFTER:
> + * Return subscription worker entry with the given subscription

Fixed.

>
> (8) pgstat_recv_subworker_error
>
> + /*
> + * Update only the counter and timestamp if we received the same error
> + * again
> + */
> + if (wentry->relid == msg->m_relid &&
> + wentry->command == msg->m_command &&
> + wentry->xid == msg->m_xid &&
> + strncmp(wentry->message, msg->m_message, strlen(wentry->message)) == 0)
> + {
>
> Is there a reason that the above check uses strncmp() with
> strlen(wentry->message), instead of just strcmp()?
> msg->m_message is treated as the same error message if it is the same
> up to strlen(wentry->message)?
> Perhaps if that is intentional, then the comment should be updated.

It's better to use strcmp() in this case. Fixed.
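
To show why the length-limited comparison in (8) was questionable, here is a minimal standalone sketch; the two helper names are hypothetical. With strncmp() bounded by strlen() of the stored message, any incoming message that merely starts with the stored text counts as "the same error", and an empty stored message matches everything.

```c
#include <stdbool.h>
#include <string.h>

/* Comparison as in the reviewed hunk: compares only strlen(stored) bytes. */
static bool
same_message_prefix(const char *stored, const char *incoming)
{
    return strncmp(stored, incoming, strlen(stored)) == 0;
}

/* Comparison after the fix: both strings must match in full. */
static bool
same_message_exact(const char *stored, const char *incoming)
{
    return strcmp(stored, incoming) == 0;
}
```

For example, a stored message "duplicate key" and an incoming message "duplicate key value violates unique constraint" are equal under the first helper but not under the second.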

>
> src/tools/pgindent/typedefs.list
>
> (9)
> The added "PgStat_SubWorkerError" should be removed from the
> typedefs.list (as there is no such new typedef).

Fixed.

I've attached updated patches.

Regards,


--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

On Thu, Oct 14, 2021 at 5:45 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Tue, Oct 12, 2021 at 4:00 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached updated patches.
> >
>
> A couple more comments for some issues that I noticed in the v16 patches:
>
> v16-0002
>
> doc/src/sgml/ref/alter_subscription.sgml
>
> (1) Order of parameters that can be reset doesn't match those that can be set.
> Also, it doesn't match the order specified in the documentation
> updates in the v16-0003 patch.
>
> Suggested change:
>
> BEFORE:
> +       The parameters that can be reset are: <literal>streaming</literal>,
> +       <literal>binary</literal>, <literal>synchronous_commit</literal>.
> AFTER:
> +       The parameters that can be reset are:
> <literal>synchronous_commit</literal>,
> +       <literal>binary</literal>, <literal>streaming</literal>.
>

Fixed.

>
> v16-0003
>
> doc/src/sgml/ref/alter_subscription.sgml
>
> (1) Documentation update says "slot_name" is a parameter that can be
> reset, but this is not correct, it can't be reset.
> Also, the doc update is missing "the" before "parameter".
>
> Suggested change:
>
> BEFORE:
> +      The parameters that can be reset are: <literal>slot_name</literal>,
> +      <literal>synchronous_commit</literal>, <literal>binary</literal>,
> +      <literal>streaming</literal>, and following parameter:
> AFTER:
> +      The parameters that can be reset are:
> <literal>synchronous_commit</literal>,
> +      <literal>binary</literal>, <literal>streaming</literal>, and
> the following
> +      parameter:

Fixed.

I've attached updated patches that incorporate all comments I got so far.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

On Wed, Oct 20, 2021 at 12:33 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Mon, Oct 18, 2021 at 12:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached updated patches that incorporate all comments I got so far.
> >
>
> Minor comment on patch 17-0003

Thank you for the comment!

>
> src/backend/replication/logical/worker.c
>
> (1) Typo in apply_handle_stream_abort() comment:
>
> /* Stop skipping transaction transaction, if enabled */
> should be:
> /* Stop skipping transaction changes, if enabled */

Fixed.

I've attached updated patches. In this version, in addition to the
review comments I got so far, I've changed the view name from
pg_stat_subscription_errors to pg_stat_subscription_workers as per the
discussion on including xact info in the view on another thread[1].
I’ve also changed the related code accordingly.

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoDF7LmSALzMfmPshRw_xFcRz3WvB-me8T2gO6Ht%3D3zL2w%40mail.gmail.com

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

On Thu, Oct 28, 2021 at 7:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Oct 28, 2021 at 10:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Oct 27, 2021 at 7:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Oct 21, 2021 at 10:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > >
> > > > I've attached updated patches.
> >
> > Thank you for the comments!
> >
> > >
> > > Few comments:
> > > ==============
> > > 1. Is the patch cleaning tablesync error entries except via vacuum? If
> > > not, can't we send a message to remove tablesync errors once tablesync
> > > is successful (say when we reset skip_xid or when tablesync is
> > > finished) or when we drop subscription? I think the same applies to
> > > apply worker. I think we may want to track it in some way whether an
> > > error has occurred before sending the message but relying completely
> > > on a vacuum might be the recipe of bloat. I think in the case of a
> > > drop subscription we can simply send the message as that is not a
> > > frequent operation. I might be missing something here because in the
> > > tests after drop subscription you are expecting the entries from the
> > > view to get cleared
> >
> > Yes, I think we can have tablesync worker send a message to drop stats
> > once tablesync is successful. But if we do that also when dropping a
> > subscription, I think we need to do that only when the transaction is
> > committed since we can drop a subscription that doesn't have a
> > replication slot and rollback the transaction. Probably we can send
> > the message only when the subscription does have a replication slot.
> >
>
> Right. And probably for apply worker after updating skip xid.

I'm not sure it's better to drop apply worker stats after resetting
skip xid (i.e., after skipping the transaction). Since the view is a
cumulative view and has last_error_time, I thought we can keep the
apply worker stats until the subscription gets dropped. Since the
error reporting message could get lost, the absence of an entry in the
view doesn't mean the worker hasn't faced an issue.

>
> > In other cases, we can remember the subscriptions being dropped and
> > send the message to drop the statistics of them after committing the
> > transaction but I’m not sure it’s worth having it.
> >
>
> Yeah, let's not go to that extent. I think in most cases subscriptions
> will have corresponding slots.

Agreed.

>
> > FWIW, we completely
> > rely on pgstat_vacuum_stat() for cleaning up the dead tables and
> > functions. And we don't expect there are many subscriptions on the
> > database.
> >
>
> True, but we do send it for the database, so let's do it for the cases
> you explained in the first paragraph.

Agreed.

I've attached a new version of the patch. Since the syntax for skipping
a transaction ID is still under discussion, I've attached only the error
reporting patch for now.


Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

v19-0001-Add-a-subscription-worker-statistics-view-pg_sta.patch

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

October 29, 2021, 08:29:52

On Thu, Oct 28, 2021 at 7:47 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, Oct 21, 2021 at 10:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Oct 20, 2021 at 12:33 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> > >
> > > On Mon, Oct 18, 2021 at 12:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > I've attached updated patches that incorporate all comments I got so far.
> > > >
> > >
> > > Minor comment on patch 17-0003
> >
> > Thank you for the comment!
> >
> > >
> > > src/backend/replication/logical/worker.c
> > >
> > > (1) Typo in apply_handle_stream_abort() comment:
> > >
> > > /* Stop skipping transaction transaction, if enabled */
> > > should be:
> > > /* Stop skipping transaction changes, if enabled */
> >
> > Fixed.
> >
> > I've attached updated patches.
>
> I have started to have a look at the feature and review the patch, my
> initial comments:

Thank you for the comments!

> 1) I could specify invalid subscriber id to
> pg_stat_reset_subscription_worker which creates an assertion failure?
>
> +static void
> +pgstat_recv_resetsubworkercounter(PgStat_MsgResetsubworkercounter *msg, int len)
> +{
> +       PgStat_StatSubWorkerEntry *wentry;
> +
> +       Assert(OidIsValid(msg->m_subid));
> +
> +       /* Get subscription worker stats */
> +       wentry = pgstat_get_subworker_entry(msg->m_subid, msg->m_subrelid, false);
>
> postgres=# select pg_stat_reset_subscription_worker(NULL, NULL);
>  pg_stat_reset_subscription_worker
> -----------------------------------
>
> (1 row)
>
> TRAP: FailedAssertion("OidIsValid(msg->m_subid)", File: "pgstat.c", Line: 5742, PID: 789588)
> postgres: stats collector (ExceptionalCondition+0xd0)[0x55d33bba4778]
> postgres: stats collector (+0x545a43)[0x55d33b90aa43]
> postgres: stats collector (+0x541fad)[0x55d33b906fad]
> postgres: stats collector (pgstat_start+0xdd)[0x55d33b9020e1]
> postgres: stats collector (+0x54ae0c)[0x55d33b90fe0c]
> /lib/x86_64-linux-gnu/libpthread.so.0(+0x141f0)[0x7f8509ccc1f0]
> /lib/x86_64-linux-gnu/libc.so.6(__select+0x57)[0x7f8509a78ac7]
> postgres: stats collector (+0x548cab)[0x55d33b90dcab]
> postgres: stats collector (PostmasterMain+0x134c)[0x55d33b90d5c6]
> postgres: stats collector (+0x43b8be)[0x55d33b8008be]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xd5)[0x7f8509992565]
> postgres: stats collector (_start+0x2e)[0x55d33b48e4fe]

Good catch. Fixed.

>
> 2) I was able to provide invalid relation id for
> pg_stat_reset_subscription_worker? Should we add any validation for
> this?
> select pg_stat_reset_subscription_worker(16389, -1);
>
> +pg_stat_reset_subscription_worker(PG_FUNCTION_ARGS)
> +{
> +       Oid                     subid = PG_GETARG_OID(0);
> +       Oid                     relid;
> +
> +       if (PG_ARGISNULL(1))
> +               relid = InvalidOid;             /* reset apply worker error stats */
> +       else
> +               relid = PG_GETARG_OID(1);       /* reset table sync worker error stats */
> +
> +       pgstat_reset_subworker_stats(subid, relid);
> +
> +       PG_RETURN_VOID();
> +}

I don't think that validation is strictly necessary. OID '-1' is interpreted as
4294967295 and we don't reject it.
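
To illustrate the point about '-1', here is a minimal standalone sketch (not code from the patch). It assumes only that Oid is an unsigned 32-bit integer; the helper name int32_to_oid is hypothetical and stands in for whatever conversion the server applies to the SQL argument:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's Oid, assumed here to be unsigned and 32 bits wide. */
typedef uint32_t Oid;

static_assert(sizeof(Oid) == 4, "this sketch assumes a 32-bit Oid");

/*
 * Hypothetical helper: convert a signed 32-bit value to an Oid with a plain
 * C cast. Conversion to an unsigned type is defined modulo 2^32, so negative
 * inputs wrap around instead of being rejected.
 */
static Oid
int32_to_oid(int32_t value)
{
    return (Oid) value;
}
```

With this, int32_to_oid(-1) yields 4294967295, which is a syntactically valid OID that simply matches no relation, so a lookup for it would find no stats entry.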

>
> 3) The 025_error_report test is failing because one of the recent
> commits changed the way a node is initialized in
> the TAP tests; corresponding changes need to be made in
> 025_error_report:
> t/025_error_report.pl .............. Dubious, test returned 2 (wstat 512, 0x200)
> No subtests run
> t/100_bugs.pl ...................... ok

Fixed.

These comments are incorporated into the latest version of the patch,
which I just submitted[1].

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoDY-9_x819F_m1_wfCVXXFJrGiSmR2MfC9Nw4nW8Om0qA%40mail.gmail.com


--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

October 29, 2021, 12:02:29

On Fri, Oct 29, 2021 at 6:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Oct 28, 2021 at 6:34 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Oct 28, 2021 at 10:56 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > >
> > > Another thing I'm concerned about is that the syntax "SKIP (
> > > subscription_parameter [=value] [, ...])" looks like we can specify
> > > multiple options, for example, "SKIP (xid = '100', lsn =
> > > '0/12345678')". Is there a case where we need to specify multiple
> > > options? Perhaps when specifying the target XID and operations, for
> > > example, "SKIP (xid = 100, action = 'insert, update')"?
> > >
> >
> > Yeah, or maybe prepared transaction identifier and actions.
>
> Prepared transactions seem not to need to be skipped since those
> changes are already successfully applied, though.
>

I think it can also fail before apply of prepare is successful. Right
now, we are just logging xid in error cases but gid could also be
logged as we receive that in begin_prepare. I think currently xid is
sufficient but I have given this as an example for future
consideration.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

October 29, 2021, 14:19:59

On Fri, Oct 29, 2021 at 10:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Oct 28, 2021 at 7:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Oct 28, 2021 at 10:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Wed, Oct 27, 2021 at 7:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Thu, Oct 21, 2021 at 10:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > >
> > > > > I've attached updated patches.
> > >
> > > Thank you for the comments!
> > >
> > > >
> > > > Few comments:
> > > > ==============
> > > > 1. Is the patch cleaning tablesync error entries except via vacuum? If
> > > > not, can't we send a message to remove tablesync errors once tablesync
> > > > is successful (say when we reset skip_xid or when tablesync is
> > > > finished) or when we drop subscription? I think the same applies to
> > > > apply worker. I think we may want to track it in some way whether an
> > > > error has occurred before sending the message but relying completely
> > > > on a vacuum might be the recipe of bloat. I think in the case of a
> > > > drop subscription we can simply send the message as that is not a
> > > > frequent operation. I might be missing something here because in the
> > > > tests after drop subscription you are expecting the entries from the
> > > > view to get cleared
> > >
> > > Yes, I think we can have tablesync worker send a message to drop stats
> > > once tablesync is successful. But if we do that also when dropping a
> > > subscription, I think we need to do that only when the transaction is
> > > committed since we can drop a subscription that doesn't have a
> > > replication slot and rollback the transaction. Probably we can send
> > > the message only when the subscription does have a replication slot.
> > >
> >
> > Right. And probably for apply worker after updating skip xid.
>
> I'm not sure it's better to drop apply worker stats after resetting
> skip xid (i.e., after skipping the transaction). Since the view is a
> cumulative view and has last_error_time, I thought we can have the
> apply worker stats until the subscription gets dropped.
>

Fair enough. So statistics can be removed either by vacuum or drop
subscription. Also, if we go by this logic then there is no harm in
retaining the stat entries for tablesync errors. Why have different
behavior for apply and tablesync workers?

I have another question in this regard. Currently, the reset function
seems to be resetting only the first stat entry for a subscription.
But can't we have multiple stat entries for a subscription considering
the view's cumulative nature?

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

October 30, 2021, 06:21:05

On Fri, Oct 29, 2021 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Oct 29, 2021 at 10:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I'm not sure it's better to drop apply worker stats after resetting
> > skip xid (i.e., after skipping the transaction). Since the view is a
> > cumulative view and has last_error_time, I thought we can have the
> > apply worker stats until the subscription gets dropped.
> >
>
> Fair enough. So statistics can be removed either by vacuum or drop
> subscription. Also, if we go by this logic then there is no harm in
> retaining the stat entries for tablesync errors. Why have different
> behavior for apply and tablesync workers?
>
> I have another question in this regard. Currently, the reset function
> seems to be resetting only the first stat entry for a subscription.
> But can't we have multiple stat entries for a subscription considering
> the view's cumulative nature?
>

Don't we want these stats to be dealt with in the same way as tables and
functions, as all the stats entries (subscription entries) are specific
to a particular database? If so, I think we should write/read these
to/from db specific stats file in the same way as we do for tables or
functions. I think in the current patch, it will unnecessarily read
and probably write subscription stats even when those are not
required.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

November 1, 2021, 04:48:17

On Fri, Oct 29, 2021 at 8:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Oct 29, 2021 at 10:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Oct 28, 2021 at 7:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Oct 28, 2021 at 10:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Wed, Oct 27, 2021 at 7:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Thu, Oct 21, 2021 at 10:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > >
> > > > > > I've attached updated patches.
> > > >
> > > > Thank you for the comments!
> > > >
> > > > >
> > > > > Few comments:
> > > > > ==============
> > > > > 1. Is the patch cleaning tablesync error entries except via vacuum? If
> > > > > not, can't we send a message to remove tablesync errors once tablesync
> > > > > is successful (say when we reset skip_xid or when tablesync is
> > > > > finished) or when we drop subscription? I think the same applies to
> > > > > apply worker. I think we may want to track it in some way whether an
> > > > > error has occurred before sending the message but relying completely
> > > > > on a vacuum might be the recipe of bloat. I think in the case of a
> > > > > drop subscription we can simply send the message as that is not a
> > > > > frequent operation. I might be missing something here because in the
> > > > > tests after drop subscription you are expecting the entries from the
> > > > > view to get cleared
> > > >
> > > > Yes, I think we can have tablesync worker send a message to drop stats
> > > > once tablesync is successful. But if we do that also when dropping a
> > > > subscription, I think we need to do that only when the transaction is
> > > > committed since we can drop a subscription that doesn't have a
> > > > replication slot and rollback the transaction. Probably we can send
> > > > the message only when the subscription does have a replication slot.
> > > >
> > >
> > > Right. And probably for apply worker after updating skip xid.
> >
> > I'm not sure it's better to drop apply worker stats after resetting
> > skip xid (i.e., after skipping the transaction). Since the view is a
> > cumulative view and has last_error_time, I thought we can have the
> > apply worker stats until the subscription gets dropped.
> >
>
> Fair enough. So statistics can be removed either by vacuum or drop
> subscription. Also, if we go by this logic then there is no harm in
> retaining the stat entries for tablesync errors. Why have different
> behavior for apply and tablesync workers?

My understanding is that the subscription worker statistics entry
corresponds to workers (but not physical workers since the physical
process is changed after restarting). So if the worker finishes its
jobs, it is no longer necessary to show errors since further problems
will not occur after that. Table sync worker’s job finishes when
completing table copy (unless table sync is performed again by REFRESH
PUBLICATION) whereas apply worker’s job finishes when the subscription
is dropped. Also, I'm concerned about a situation where a lot of
table syncs have failed. In that case, if we don't drop the table sync
worker statistics after the worker completes its job, we end up having
a lot of entries in the view until the subscription is dropped.

>
> I have another question in this regard. Currently, the reset function
> seems to be resetting only the first stat entry for a subscription.
> But can't we have multiple stat entries for a subscription considering
> the view's cumulative nature?

I might be missing your point, but I think that with the current
patch, the view has multiple entries for a subscription. That is,
there is one apply worker stats entry and multiple table sync worker
stats entries per subscription. And the
pg_stat_reset_subscription_worker() function can reset any of them by
specifying the subscription OID and relation OID.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

November 1, 2021, 04:54:31

On Sat, Oct 30, 2021 at 12:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Oct 29, 2021 at 4:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Oct 29, 2021 at 10:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > I'm not sure it's better to drop apply worker stats after resetting
> > > skip xid (i.e., after skipping the transaction). Since the view is a
> > > cumulative view and has last_error_time, I thought we can have the
> > > apply worker stats until the subscription gets dropped.
> > >
> >
> > Fair enough. So statistics can be removed either by vacuum or drop
> > subscription. Also, if we go by this logic then there is no harm in
> > retaining the stat entries for tablesync errors. Why have different
> > behavior for apply and tablesync workers?
> >
> > I have another question in this regard. Currently, the reset function
> > seems to be resetting only the first stat entry for a subscription.
> > But can't we have multiple stat entries for a subscription considering
> > the view's cumulative nature?
> >
>
> Don't we want these stats to be dealt with in the same way as tables and
> functions as all the stats entries (subscription entries) are specific
> to a particular database? If so, I think we should write/read these
> to/from db specific stats file in the same way as we do for tables or
> functions. I think in the current patch, it will unnecessarily read
> and probably write subscription stats even when those are not
> required.

Good point! So we should probably make PgStat_StatDBEntry hold the
hash table for subscription worker statistics, right?
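
To make the idea concrete, here is a rough standalone sketch of a per-database stats entry that owns its subscription worker entries, keyed by (subid, subrelid). It is only an illustration under assumptions: DBStatsSketch, SubWorkerEntry, MAX_SUBWORKERS and get_subworker_entry are hypothetical names, and a small fixed array stands in for the hash table that the real PgStat_StatDBEntry would hold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;               /* stand-in for PostgreSQL's Oid */
#define InvalidOid ((Oid) 0)
#define MAX_SUBWORKERS 8            /* arbitrary capacity, for the sketch only */

/* Hypothetical stats entry for one logical replication worker. */
typedef struct SubWorkerEntry
{
    Oid         subid;          /* subscription OID */
    Oid         subrelid;       /* InvalidOid for apply worker, else synced table */
    uint64_t    error_count;    /* cumulative number of reported errors */
} SubWorkerEntry;

/* Hypothetical per-database entry; an array replaces the hash table here. */
typedef struct DBStatsSketch
{
    Oid             databaseid;
    size_t          nsubworkers;
    SubWorkerEntry  subworkers[MAX_SUBWORKERS];
} DBStatsSketch;

/*
 * Look up the entry for (subid, subrelid) within one database's stats.
 * If it does not exist, create it when 'create' is true and there is room;
 * otherwise return NULL.
 */
static SubWorkerEntry *
get_subworker_entry(DBStatsSketch *db, Oid subid, Oid subrelid, bool create)
{
    SubWorkerEntry *entry;

    assert(db != NULL);

    for (size_t i = 0; i < db->nsubworkers; i++)
    {
        entry = &db->subworkers[i];
        if (entry->subid == subid && entry->subrelid == subrelid)
            return entry;
    }

    if (!create || db->nsubworkers >= MAX_SUBWORKERS)
        return NULL;

    entry = &db->subworkers[db->nsubworkers++];
    entry->subid = subid;
    entry->subrelid = subrelid;
    entry->error_count = 0;
    return entry;
}
```

The point of hanging the entries off the database entry is that they would be read and written together with that database's stats file, and only when that database's stats are needed.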

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Greg Nancarrow

Date:

November 2, 2021, 06:51:34

On Fri, Oct 29, 2021 at 4:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached a new version patch. Since the syntax of skipping
> transaction id is under the discussion I've attached only the error
> reporting patch for now.
>

I have some comments on the v19-0001 patch:

v19-0001

(1) doc/src/sgml/monitoring.sgml
Seems to be missing the word "information":

BEFORE:
+      <entry>At least one row per subscription, showing about errors that
+      occurred on subscription.
AFTER:
+      <entry>At least one row per subscription, showing information about
+      errors that occurred on subscription.

(2) pg_stat_reset_subscription_worker(subid Oid, relid Oid)
First of all, I think that the documentation for this function should
make it clear that a non-NULL "subid" parameter is required for both
reset cases (tablesync and apply).
Perhaps this could be done by simply changing the first sentence to say:
"Resets statistics of a single subscription worker error, for a worker
running on subscription with <parameter>subid</parameter>."
(and then can remove " running on the subscription with
<parameter>subid</parameter>" from the last sentence)

I think that the documentation for this function should say that it
should be used in conjunction with the "pg_stat_subscription_workers"
view in order to obtain the required subid/relid values for resetting.
(and should provide a link to the documentation for that view)
Also, I think that the function documentation should make it clear
that the tablesync error case is indicated by a NULL "command" in the
information returned from the "pg_stat_subscription_workers" view
(otherwise the user needs to look at the server log in order to
determine whether the error is for the apply/tablesync worker).

Finally, there are currently no tests for this new function.

(3)  pg_stat_subscription_workers
In the documentation for this, the description for the "command"
column says: "This field is always NULL if the error was reported
during the initial data copy."
Some users may not realise that this refers to "tablesync", so perhaps
add " (tablesync)" to the end of this sentence, or similar.

Regards,
Greg Nancarrow
Fujitsu Australia

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

November 2, 2021, 08:34:57

On Mon, Nov 1, 2021 at 7:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Oct 29, 2021 at 8:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > Fair enough. So statistics can be removed either by vacuum or drop
> > subscription. Also, if we go by this logic then there is no harm in
> > retaining the stat entries for tablesync errors. Why have different
> > behavior for apply and tablesync workers?
>
> My understanding is that the subscription worker statistics entry
> corresponds to workers (but not physical workers since the physical
> process is changed after restarting). So if the worker finishes its
> jobs, it is no longer necessary to show errors since further problems
> will not occur after that. Table sync worker’s job finishes when
> completing table copy (unless table sync is performed again by REFRESH
> PUBLICATION) whereas apply worker’s job finishes when the subscription
> is dropped.
>

Actually, I am not very sure how users can use the old error
information after we allowed skipping the conflicting xid. Say, if
they want to add/remove some constraints on the table based on
previous errors then they might want to refer to errors of both the
apply worker and table sync worker.

> Also, I'm concerned about a situation where a lot of
> table syncs have failed. In that case, if we don't drop the table sync
> worker statistics after the worker completes its job, we end up having
> a lot of entries in the view until the subscription is dropped.
>

True, but the same could be said for apply workers where errors can be
accumulated over a period of time.

> >
> > I have another question in this regard. Currently, the reset function
> > seems to be resetting only the first stat entry for a subscription.
> > But can't we have multiple stat entries for a subscription considering
> > the view's cumulative nature?
>
> I might be missing your points but I think that with the current
> patch, the view has multiple entries for a subscription. That is,
> there is one apply worker stats and multiple table sync worker stats
> per subscription.
>

Can't we have multiple entries for one apply worker?

> And pg_stat_reset_subscription() function can reset
> any stats by specifying subscription OID and relation OID.
>

Say, if the user has supplied just subscription OID then isn't it
better to reset all the error entries for that subscription?

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

November 2, 2021, 08:36:27

On Mon, Nov 1, 2021 at 7:25 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sat, Oct 30, 2021 at 12:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > Don't we want these stats to be dealt with in the same way as tables and
> > functions as all the stats entries (subscription entries) are specific
> > to a particular database? If so, I think we should write/read these
> > to/from db specific stats file in the same way as we do for tables or
> > functions. I think in the current patch, it will unnecessarily read
> > and probably write subscription stats even when those are not
> > required.
>
> Good point! So probably we should have PgStat_StatDBEntry have the
> hash table for subscription worker statistics, right?
>

Yes.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

November 2, 2021, 11:47:14

On Tue, Nov 2, 2021 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Nov 1, 2021 at 7:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Oct 29, 2021 at 8:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > Fair enough. So statistics can be removed either by vacuum or drop
> > > subscription. Also, if we go by this logic then there is no harm in
> > > retaining the stat entries for tablesync errors. Why have different
> > > behavior for apply and tablesync workers?
> >
> > My understanding is that the subscription worker statistics entry
> > corresponds to workers (but not physical workers since the physical
> > process is changed after restarting). So if the worker finishes its
> > jobs, it is no longer necessary to show errors since further problems
> > will not occur after that. Table sync worker’s job finishes when
> > completing table copy (unless table sync is performed again by REFRESH
> > PUBLICATION) whereas apply worker’s job finishes when the subscription
> > is dropped.
> >
>
> Actually, I am not very sure how users can use the old error
> information after we allowed skipping the conflicting xid. Say, if
> they want to add/remove some constraints on the table based on
> previous errors then they might want to refer to errors of both the
> apply worker and table sync worker.

I think that in general, statistics should be retained as long as a
corresponding object exists on the database, like other cumulative
statistics views. So I'm concerned that an entry of a cumulative stats
view would be automatically removed by a non-stats-related command
(e.g., ALTER SUBSCRIPTION SKIP), which seems to be a new behavior for
cumulative stats views.

We can retain the stats entries for table sync workers, but what I want
to avoid is the view showing many old entries that will never be
updated. I've sometimes seen cases where the user mistakenly restored
table data on the subscriber before creating a subscription, table
sync then failed on many tables due to unique violations, and the user
truncated the tables on the subscriber. I think that unlike the stats
entries for the apply worker, retaining the stats entries for table
sync could be harmful since there are likely to be many of them (even
hundreds of entries). In particular, it could bloat the stats file
since each entry holds an error message. So if we do that, I'd like to
provide a function for users to remove (not reset) stats entries
manually. Even if we removed stats entries after skipping the
transaction in question, the stats entries would be left if we
resolved the conflict in another way.

>
> > >
> > > I have another question in this regard. Currently, the reset function
> > > seems to be resetting only the first stat entry for a subscription.
> > > But can't we have multiple stat entries for a subscription considering
> > > the view's cumulative nature?
> >
> > I might be missing your points but I think that with the current
> > patch, the view has multiple entries for a subscription. That is,
> > there is one apply worker stats and multiple table sync worker stats
> > per subscription.
> >
>
> Can't we have multiple entries for one apply worker?

Umm, I think we have one stats entry per one logical replication
worker (apply worker or table sync worker). Am I missing something?

>
> > And pg_stat_reset_subscription() function can reset
> > any stats by specifying subscription OID and relation OID.
> >
>
> Say, if the user has supplied just subscription OID then isn't it
> better to reset all the error entries for that subscription?

Agreed. So pg_stat_reset_subscription_worker(oid) removes all errors
for the subscription whereas pg_stat_reset_subscription_worker(oid,
null) resets only the apply worker error for the subscription?
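
The two reset behaviors being discussed can be sketched as follows. This is a standalone illustration, not the patch's code: SubWorkerStat, reset_subworker_stat and reset_all_subworker_stats are hypothetical names, a plain array stands in for the stats storage, and "reset" is modeled as zeroing a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;               /* stand-in for PostgreSQL's Oid */
#define InvalidOid ((Oid) 0)

/* Hypothetical per-worker error entry, keyed by (subid, relid). */
typedef struct SubWorkerStat
{
    Oid         subid;          /* subscription OID */
    Oid         relid;          /* InvalidOid for apply worker, else synced table */
    uint64_t    error_count;    /* cumulative number of reported errors */
} SubWorkerStat;

/*
 * Two-argument form: reset exactly one worker's entry. Passing InvalidOid
 * as relid (a NULL second argument at the SQL level) selects the apply
 * worker. Returns the number of entries that were reset.
 */
static size_t
reset_subworker_stat(SubWorkerStat *stats, size_t n, Oid subid, Oid relid)
{
    size_t      nreset = 0;

    assert(stats != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        if (stats[i].subid == subid && stats[i].relid == relid)
        {
            stats[i].error_count = 0;
            nreset++;
        }
    }
    return nreset;
}

/*
 * One-argument form: reset every entry (apply and table sync workers)
 * that belongs to the subscription. Returns the number of entries reset.
 */
static size_t
reset_all_subworker_stats(SubWorkerStat *stats, size_t n, Oid subid)
{
    size_t      nreset = 0;

    assert(stats != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        if (stats[i].subid == subid)
        {
            stats[i].error_count = 0;
            nreset++;
        }
    }
    return nreset;
}
```

For a subscription with one apply worker entry and two table sync entries, the two-argument form with InvalidOid touches one entry, while the one-argument form touches all three and leaves other subscriptions alone.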

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

RE: Skipping logical replication transactions on subscriber side

From

"tanghy.fnst@fujitsu.com"

Date:

November 2, 2021, 12:45:27

On Friday, October 29, 2021 1:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> 
> I've attached a new version patch. Since the syntax of skipping
> transaction id is under the discussion I've attached only the error
> reporting patch for now.
> 
> 

Thanks for your patch. Some comments on the 026_error_report.pl file.

1. For test_tab_streaming table, the test only checks initial table sync and
doesn't check anything related to the new view pg_stat_subscription_workers. Do
you want to add more test cases for it?

2. The subscriptions are created with two_phase option on, but I didn't see two
phase transactions. Should we add some test cases for two phase transactions?

3. Errors reported by the table sync worker will be cleaned up once the
table sync worker finishes. Should we add this case to the test? (After
checking the table sync worker's error in the view, delete the data that
caused the error, then check the view again after the table sync worker
has finished.)

Regards
Tang

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

November 2, 2021, 13:07:18

On Tue, Nov 2, 2021 at 2:17 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Nov 2, 2021 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > > >
> > > > I have another question in this regard. Currently, the reset function
> > > > seems to be resetting only the first stat entry for a subscription.
> > > > But can't we have multiple stat entries for a subscription considering
> > > > the view's cumulative nature?
> > >
> > > I might be missing your points but I think that with the current
> > > patch, the view has multiple entries for a subscription. That is,
> > > there is one apply worker stats and multiple table sync worker stats
> > > per subscription.
> > >
> >
> > Can't we have multiple entries for one apply worker?
>
> Umm, I think we have one stats entry per one logical replication
> worker (apply worker or table sync worker). Am I missing something?
>

No, you are right. I got confused.

> >
> > > And pg_stat_reset_subscription() function can reset
> > > any stats by specifying subscription OID and relation OID.
> > >
> >
> > Say, if the user has supplied just subscription OID then isn't it
> > better to reset all the error entries for that subscription?
>
> Agreed. So pg_stat_reset_subscription_worker(oid) removes all errors
> for the subscription whereas pg_stat_reset_subscription_worker(oid,
> null) resets only the apply worker error for the subscription?
>

Yes.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

November 3, 2021, 06:41:26

On Tue, Nov 2, 2021 at 2:17 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Nov 2, 2021 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Nov 1, 2021 at 7:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Fri, Oct 29, 2021 at 8:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > >
> > > > Fair enough. So statistics can be removed either by vacuum or drop
> > > > subscription. Also, if we go by this logic then there is no harm in
> > > > retaining the stat entries for tablesync errors. Why have different
> > > > behavior for apply and tablesync workers?
> > >
> > > My understanding is that the subscription worker statistics entry
> > > corresponds to workers (but not physical workers since the physical
> > > process is changed after restarting). So if the worker finishes its
> > > jobs, it is no longer necessary to show errors since further problems
> > > will not occur after that. Table sync worker’s job finishes when
> > > completing table copy (unless table sync is performed again by REFRESH
> > > PUBLICATION) whereas apply worker’s job finishes when the subscription
> > > is dropped.
> > >
> >
> > Actually, I am not very sure how users can use the old error
> > information after we allowed skipping the conflicting xid. Say, if
> > they want to add/remove some constraints on the table based on
> > previous errors then they might want to refer to errors of both the
> > apply worker and table sync worker.
>
> I think that in general, statistics should be retained as long as a
> corresponding object exists on the database, like other cumulative
> statistic views. So I’m concerned that an entry of a cumulative stats
> view is automatically removed by a non-stats-related function (i.e.,
> ALTER SUBSCRIPTION SKIP), which seems to be a new behavior for
> cumulative stats views.
>
> We can retain the stats entries for table sync worker but what I want
> to avoid is that the view shows many old entries that will never be
> updated. I've sometimes seen cases where the user mistakenly restored
> table data on the subscriber before creating a subscription, failed
> table sync on many tables due to unique violation, and truncated
> tables on the subscriber. I think that unlike the stats entries for
> apply worker, retaining the stats entries for table sync could be
> harmful since it’s likely to be a large amount (even hundreds of
> entries). In particular, it could bloat the stats file since each entry
> has an error message. So if we do that, I'd like to provide a function
> for users to remove (not reset) stats entries manually.
>

If we follow the idea of keeping stats at db level (in
PgStat_StatDBEntry) as discussed above then I think we already have a
way to remove stat entries via pg_stat_reset which removes the stats
corresponding to tables, functions and after this patch corresponding
to subscriptions as well for the current database. Won't that be
sufficient? I see your point but I think it may be better if we keep
the same behavior for stats of apply and table sync workers.

Following the tables, functions, I thought of keeping the name of the
reset function similar to "pg_stat_reset_single_table_counters" but I
feel the currently used name "pg_stat_reset_subscription_worker" in
the patch is better. Do let me know what you think?

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

vignesh C

Date:

November 4, 2021, 18:57:43

On Fri, Oct 29, 2021 at 10:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Oct 28, 2021 at 7:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Oct 28, 2021 at 10:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Wed, Oct 27, 2021 at 7:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Thu, Oct 21, 2021 at 10:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > >
> > > > > I've attached updated patches.
> > >
> > > Thank you for the comments!
> > >
> > > >
> > > > Few comments:
> > > > ==============
> > > > 1. Is the patch cleaning tablesync error entries except via vacuum? If
> > > > not, can't we send a message to remove tablesync errors once tablesync
> > > > is successful (say when we reset skip_xid or when tablesync is
> > > > finished) or when we drop subscription? I think the same applies to
> > > > apply worker. I think we may want to track it in some way whether an
> > > > error has occurred before sending the message but relying completely
> > > > on a vacuum might be a recipe for bloat. I think in the case of a
> > > > drop subscription we can simply send the message as that is not a
> > > > frequent operation. I might be missing something here because in the
> > > > tests after drop subscription you are expecting the entries from the
> > > > view to get cleared
> > >
> > > Yes, I think we can have tablesync worker send a message to drop stats
> > > once tablesync is successful. But if we do that also when dropping a
> > > subscription, I think we need to do that only after the transaction is
> > > committed since we can drop a subscription that doesn't have a
> > > replication slot and roll back the transaction. Probably we can send
> > > the message only when the subscription does have a replication slot.
> > >
> >
> > Right. And probably for apply worker after updating skip xid.
>
> I'm not sure it's better to drop apply worker stats after resetting
> skip xid (i.e., after skipping the transaction). Since the view is a
> cumulative view and has last_error_time, I thought we can have the
> apply worker stats until the subscription gets dropped. Since the
> error reporting message could get lost, no entry in the view doesn’t
> mean the worker doesn’t face an issue.
>
> >
> > > In other cases, we can remember the subscriptions being dropped and
> > > send the message to drop the statistics of them after committing the
> > > transaction but I’m not sure it’s worth having it.
> > >
> >
> > Yeah, let's not go to that extent. I think in most cases subscriptions
> > will have corresponding slots.
>
> Agreed.
>
> >
> >  FWIW, we completely
> > > rely on pg_stat_vacuum_stats() for cleaning up the dead tables and
> > > functions. And we don't expect there are many subscriptions on the
> > > database.
> > >
> >
> > True, but we do send it for the database, so let's do it for the cases
> > you explained in the first paragraph.
>
> Agreed.
>
> I've attached a new version patch. Since the syntax of skipping
> transaction id is under the discussion I've attached only the error
> reporting patch for now.

Thanks for the updated patch, few comments:
1) This check and return can be moved above CreateTemplateTupleDesc so
that the tuple descriptor need not be created if there is no worker
statistics
+       BlessTupleDesc(tupdesc);
+
+       /* Get subscription worker stats */
+       wentry = pgstat_fetch_subworker(subid, subrelid);
+
+       /* Return NULL if there is no worker statistics */
+       if (wentry == NULL)
+               PG_RETURN_NULL();
+
+       /* Initialise values and NULL flags arrays */
+       MemSet(values, 0, sizeof(values));
+       MemSet(nulls, 0, sizeof(nulls));

2) "NULL for the main apply worker" is mentioned as "null for the main
apply worker" in case of pg_stat_subscription view, we can mention it
similarly.
+      <para>
+       OID of the relation that the worker is synchronizing; NULL for the
+       main apply worker
+      </para></entry>

3) The variable can be initialized at its declaration, and this
assignment can be removed
+       i = 0;
+       /* subid */
+       values[i++] = ObjectIdGetDatum(subid);

4) I noticed that the worker error is still present when queried from
pg_stat_subscription_workers even after conflict is resolved in the
subscriber and the worker proceeds with applying the other
transactions, should this be documented somewhere?

5) This needs to be aligned, the columns in select have used TAB, we
should align it using spaces.
+CREATE VIEW pg_stat_subscription_workers AS
+    SELECT
+       w.subid,
+       s.subname,
+       w.subrelid,
+       w.relid,
+       w.command,
+       w.xid,
+       w.error_count,
+       w.error_message,
+       w.last_error_time,
+       w.stats_reset

Regards,
Vignesh
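
Review comments 1 and 3 above can be illustrated with a minimal, self-contained sketch. It is not the patch's code: `WorkerStats`, `fetch_worker_stats`, `build_row`, and the `descriptors_built` counter are hypothetical stand-ins for the real PostgreSQL structures and calls. The point is only the ordering: look up the entry first, return early if it is absent, and only then do the descriptor work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;

/* Hypothetical stand-in for a subscription worker stats entry. */
typedef struct WorkerStats
{
    Oid         subid;
    Oid         subrelid;
    long        error_count;
} WorkerStats;

/* Counts how often the (pretend) tuple descriptor setup runs. */
int         descriptors_built = 0;

/* Linear lookup standing in for the stats fetch; returns NULL if absent. */
const WorkerStats *
fetch_worker_stats(const WorkerStats *tab, size_t n, Oid subid, Oid subrelid)
{
    for (size_t i = 0; i < n; i++)
    {
        if (tab[i].subid == subid && tab[i].subrelid == subrelid)
            return &tab[i];
    }
    return NULL;
}

/*
 * Fill values[] for the matching entry.  Returns false (the analogue of
 * returning SQL NULL) before any descriptor work when no entry exists.
 */
bool
build_row(const WorkerStats *tab, size_t n, Oid subid, Oid subrelid,
          long values[3])
{
    const WorkerStats *wentry = fetch_worker_stats(tab, n, subid, subrelid);
    int         i = 0;          /* initialized at declaration (comment 3) */

    assert(values != NULL);

    if (wentry == NULL)
        return false;           /* nothing to show: skip descriptor setup */

    descriptors_built++;        /* stands in for building the descriptor */

    values[i++] = (long) wentry->subid;
    values[i++] = (long) wentry->subrelid;
    values[i++] = wentry->error_count;
    return true;
}
```

With this ordering, a lookup miss costs only the fetch itself; the descriptor counter stays untouched until an entry is actually found.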

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

November 7, 2021, 17:19:48

On Wed, Nov 3, 2021 at 12:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Nov 2, 2021 at 2:17 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Nov 2, 2021 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Nov 1, 2021 at 7:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Fri, Oct 29, 2021 at 8:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > >
> > > > > Fair enough. So statistics can be removed either by vacuum or drop
> > > > > subscription. Also, if we go by this logic then there is no harm in
> > > > > retaining the stat entries for tablesync errors. Why have different
> > > > > behavior for apply and tablesync workers?
> > > >
> > > > My understanding is that the subscription worker statistics entry
> > > > corresponds to workers (but not physical workers since the physical
> > > > process is changed after restarting). So if the worker finishes its
> > > > jobs, it is no longer necessary to show errors since further problems
> > > > will not occur after that. Table sync worker’s job finishes when
> > > > completing table copy (unless table sync is performed again by REFRESH
> > > > PUBLICATION) whereas apply worker’s job finishes when the subscription
> > > > is dropped.
> > > >
> > >
> > > Actually, I am not very sure how users can use the old error
> > > information after we allowed skipping the conflicting xid. Say, if
> > > they want to add/remove some constraints on the table based on
> > > previous errors then they might want to refer to errors of both the
> > > apply worker and table sync worker.
> >
> > I think that in general, statistics should be retained as long as a
> > corresponding object exists on the database, like other cumulative
> > statistic views. So I’m concerned that an entry of a cumulative stats
> > view is automatically removed by a non-stats-related function (i.e.,
> > ALTER SUBSCRIPTION SKIP), which seems to be a new behavior for
> > cumulative stats views.
> >
> > We can retain the stats entries for table sync worker but what I want
> > to avoid is that the view shows many old entries that will never be
> > updated. I've sometimes seen cases where the user mistakenly restored
> > table data on the subscriber before creating a subscription, failed
> > table sync on many tables due to unique violation, and truncated
> > tables on the subscriber. I think that unlike the stats entries for
> > apply worker, retaining the stats entries for table sync could be
> > harmful since it’s likely to be a large amount (even hundreds of
> > entries). In particular, it could bloat the stats file since each entry
> > has an error message. So if we do that, I'd like to provide a function
> > for users to remove (not reset) stats entries manually.
> >
>
> If we follow the idea of keeping stats at db level (in
> PgStat_StatDBEntry) as discussed above then I think we already have a
> way to remove stat entries via pg_stat_reset which removes the stats
> corresponding to tables, functions and after this patch corresponding
> to subscriptions as well for the current database. Won't that be
> sufficient? I see your point but I think it may be better if we keep
> the same behavior for stats of apply and table sync workers.

Makes sense.

>
> Following the tables, functions, I thought of keeping the name of the
> reset function similar to "pg_stat_reset_single_table_counters" but I
> feel the currently used name "pg_stat_reset_subscription_worker" in
> the patch is better. Do let me know what you think?

Yeah, I also tend to prefer pg_stat_reset_subscription_worker name
since "single" isn't clear in the context of subscription worker.  And
the behavior of the reset function for subscription workers is also
different from pg_stat_reset_single_xxx_counters.

I've attached an updated patch. In this version patch, subscription
worker statistics are collected per-database and handled in a similar
way to tables and functions. I think perhaps we still need to discuss
details of how the statistics should be handled but I'd like to share
the patch for discussion.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

v20-0001-Add-a-subscription-worker-statistics-view-pg_sta.patch

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

November 7, 2021, 17:21:20

On Fri, Nov 5, 2021 at 12:57 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, Oct 29, 2021 at 10:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Oct 28, 2021 at 7:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Oct 28, 2021 at 10:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Wed, Oct 27, 2021 at 7:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Thu, Oct 21, 2021 at 10:29 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > >
> > > > > > I've attached updated patches.
> > > >
> > > > Thank you for the comments!
> > > >
> > > > >
> > > > > Few comments:
> > > > > ==============
> > > > > 1. Is the patch cleaning tablesync error entries except via vacuum? If
> > > > > not, can't we send a message to remove tablesync errors once tablesync
> > > > > is successful (say when we reset skip_xid or when tablesync is
> > > > > finished) or when we drop subscription? I think the same applies to
> > > > > apply worker. I think we may want to track it in some way whether an
> > > > > error has occurred before sending the message but relying completely
> > > > > > on a vacuum might be a recipe for bloat. I think in the case of a
> > > > > drop subscription we can simply send the message as that is not a
> > > > > frequent operation. I might be missing something here because in the
> > > > > tests after drop subscription you are expecting the entries from the
> > > > > view to get cleared
> > > >
> > > > Yes, I think we can have tablesync worker send a message to drop stats
> > > > once tablesync is successful. But if we do that also when dropping a
> > > > > subscription, I think we need to do that only after the transaction is
> > > > > committed since we can drop a subscription that doesn't have a
> > > > > replication slot and roll back the transaction. Probably we can send
> > > > > the message only when the subscription does have a replication slot.
> > > >
> > >
> > > Right. And probably for apply worker after updating skip xid.
> >
> > I'm not sure it's better to drop apply worker stats after resetting
> > > skip xid (i.e., after skipping the transaction). Since the view is a
> > cumulative view and has last_error_time, I thought we can have the
> > apply worker stats until the subscription gets dropped. Since the
> > error reporting message could get lost, no entry in the view doesn’t
> > mean the worker doesn’t face an issue.
> >
> > >
> > > > In other cases, we can remember the subscriptions being dropped and
> > > > send the message to drop the statistics of them after committing the
> > > > transaction but I’m not sure it’s worth having it.
> > > >
> > >
> > > Yeah, let's not go to that extent. I think in most cases subscriptions
> > > will have corresponding slots.
> >
> > Agreed.
> >
> > >
> > >  FWIW, we completely
> > > > rely on pg_stat_vacuum_stats() for cleaning up the dead tables and
> > > > functions. And we don't expect there are many subscriptions on the
> > > > database.
> > > >
> > >
> > > True, but we do send it for the database, so let's do it for the cases
> > > you explained in the first paragraph.
> >
> > Agreed.
> >
> > I've attached a new version patch. Since the syntax of skipping
> > transaction id is under the discussion I've attached only the error
> > reporting patch for now.
>
> Thanks for the updated patch, few comments:
> 1) This check and return can be moved above CreateTemplateTupleDesc so
> that the tuple descriptor need not be created if there is no worker
> statistics
> +       BlessTupleDesc(tupdesc);
> +
> +       /* Get subscription worker stats */
> +       wentry = pgstat_fetch_subworker(subid, subrelid);
> +
> +       /* Return NULL if there is no worker statistics */
> +       if (wentry == NULL)
> +               PG_RETURN_NULL();
> +
> +       /* Initialise values and NULL flags arrays */
> +       MemSet(values, 0, sizeof(values));
> +       MemSet(nulls, 0, sizeof(nulls));
>
> 2) "NULL for the main apply worker" is mentioned as "null for the main
> apply worker" in case of pg_stat_subscription view, we can mention it
> similarly.
> +      <para>
> +       OID of the relation that the worker is synchronizing; NULL for the
> +       main apply worker
> +      </para></entry>
>
> 3) The variable can be initialized at its declaration, and this
> assignment can be removed
> +       i = 0;
> +       /* subid */
> +       values[i++] = ObjectIdGetDatum(subid);
>
> 4) I noticed that the worker error is still present when queried from
> pg_stat_subscription_workers even after conflict is resolved in the
> subscriber and the worker proceeds with applying the other
> transactions, should this be documented somewhere?
>
> 5) This needs to be aligned, the columns in select have used TAB, we
> should align it using spaces.
> +CREATE VIEW pg_stat_subscription_workers AS
> +    SELECT
> +       w.subid,
> +       s.subname,
> +       w.subrelid,
> +       w.relid,
> +       w.command,
> +       w.xid,
> +       w.error_count,
> +       w.error_message,
> +       w.last_error_time,
> +       w.stats_reset
>

Thank you for the comments! These comments are incorporated into the
latest (v20) patch I just submitted[1].

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoAT42mhcqeB1jPfRL1%2BEUHbZk8MMY_fBgsyZvJeKNpG%2Bw%40mail.gmail.com

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Greg Nancarrow

Date:

November 8, 2021, 10:10:30

On Mon, Nov 8, 2021 at 1:20 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached an updated patch. In this version patch, subscription
> worker statistics are collected per-database and handled in a similar
> way to tables and functions. I think perhaps we still need to discuss
> details of how the statistics should be handled but I'd like to share
> the patch for discussion.
>

Thanks for the updated patch.
Some initial comments on the v20 patch:

doc/src/sgml/monitoring.sgml

(1) wording
The word "information" seems to be missing after "showing" (otherwise
it reads "showing about errors", which isn't correct grammar).
I suggest the following change:

BEFORE:
+      <entry>At least one row per subscription, showing about errors that
+      occurred on subscription.
AFTER:
+      <entry>At least one row per subscription, showing information about
+      errors that occurred on subscription.

(2) pg_stat_reset_subscription_worker(subid Oid, relid Oid) function
documentation
The description doesn't read well. I'd suggest the following change:

BEFORE:
* Resets statistics of a single subscription worker statistics.
AFTER:
* Resets the statistics of a single subscription worker.

I think that the documentation for this function should make it clear
that a non-NULL "subid" parameter is required for both reset cases
(tablesync and apply).
Perhaps this could be done by simply changing the first sentence to say:
"Resets the statistics of a single subscription worker, for a worker
running on the subscription with <parameter>subid</parameter>."
(and then can remove " running on the subscription with
<parameter>subid</parameter>" from the last sentence)

I think that the documentation for this function should say that it
should be used in conjunction with the "pg_stat_subscription_workers"
view in order to obtain the required subid/relid values for resetting.
(and should provide a link to the documentation for that view)

Also, I think that the function documentation should make it clear how
to distinguish the tablesync vs apply worker statistics case.
e.g. the tablesync error case is indicated by a null "command" in the
information returned from the "pg_stat_subscription_workers" view
(otherwise it seems a user could only know this by looking at the server log).

Finally, there are currently no tests for this new function.

(3) pg_stat_subscription_workers
In the documentation for this, some users may not realise that "the
initial data copy" refers to "tablesync", so maybe say "the initial
data copy (tablesync)", or similar.

(4) stats_reset
"stats_reset" is currently documented as the last column of the
"pg_stat_subscription_workers" view - but it's actually no longer
included in the view.

(5) src/tools/pgindent/typedefs.list
The following current entries are bogus:
PgStat_MsgSubWorkerErrorPurge
PgStat_MsgSubWorkerPurge

The following entry is missing:
PgStat_MsgSubscriptionPurge

Regards,
Greg Nancarrow
Fujitsu Australia

Re: Skipping logical replication transactions on subscriber side

From

Dilip Kumar

Date:

November 9, 2021, 09:07:14

On Sun, Nov 7, 2021 at 7:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> I've attached an updated patch. In this version patch, subscription
> worker statistics are collected per-database and handled in a similar
> way to tables and functions. I think perhaps we still need to discuss
> details of how the statistics should be handled but I'd like to share
> the patch for discussion.

While reviewing the v20, I have some initial comments,

+     <row>
+      <entry><structname>pg_stat_subscription_workers</structname><indexterm><primary>pg_stat_subscription_workers</primary></indexterm></entry>
+      <entry>At least one row per subscription, showing about errors that
+      occurred on subscription.
+      See <link linkend="monitoring-pg-stat-subscription-workers">
+      <structname>pg_stat_subscription_workers</structname></link> for details.
+      </entry>

1.
I don't like the fact that this view is very specific for showing the
errors but the name of the view is very generic.  So are we keeping
this name to expand the scope of the view in the future?  If this is
meant only for showing the errors then the name should be more
specific.

2.
Why does the comment say "At least one row per subscription"? This looks
confusing; I mean, if there is no error then there will not be even one
row, right?

+  <para>
+   The <structname>pg_stat_subscription_workers</structname> view will contain
+   one row per subscription error reported by workers applying logical
+   replication changes and workers handling the initial data copy of the
+   subscribed tables.
+  </para>

3.
So there will only be one row per subscription?  I did not read the
code, but suppose there was an error due to some constraint; if that
constraint is then removed and a new error occurs, will the old error
be removed immediately or will it be removed by autovacuum?  If it is
not removed immediately then there could be multiple errors per
subscription in the view, so the comment is not correct.

4.
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>last_error_time</structfield> <type>timestamp with time zone</type>
+      </para>
+      <para>
+       Time at which the last error occurred
+      </para></entry>
+     </row>

Would it be useful to also know when the error first occurred?

5.
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>stats_reset</structfield> <type>timestamp with time zone</type>
+      </para>
+      <para>

The actual view does not contain this column.

6.
+       <para>
+        Resets statistics of a single subscription worker statistics.

/Resets statistics of a single subscription worker statistics/Resets
statistics of a single subscription worker

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

November 9, 2021, 09:07:48

On Sun, Nov 7, 2021 at 7:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Nov 3, 2021 at 12:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Nov 2, 2021 at 2:17 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > If we follow the idea of keeping stats at db level (in
> > PgStat_StatDBEntry) as discussed above then I think we already have a
> > way to remove stat entries via pg_stat_reset which removes the stats
> > corresponding to tables, functions and after this patch corresponding
> > to subscriptions as well for the current database. Won't that be
> > sufficient? I see your point but I think it may be better if we keep
> > the same behavior for stats of apply and table sync workers.
>
> Makes sense.
>

We can document this point.

> >
> > Following the tables, functions, I thought of keeping the name of the
> > reset function similar to "pg_stat_reset_single_table_counters" but I
> > feel the currently used name "pg_stat_reset_subscription_worker" in
> > the patch is better. Do let me know what you think?
>
> Yeah, I also tend to prefer pg_stat_reset_subscription_worker name
> since "single" isn't clear in the context of subscription worker.  And
> the behavior of the reset function for subscription workers is also
> different from pg_stat_reset_single_xxx_counters.
>
> I've attached an updated patch. In this version patch, subscription
> worker statistics are collected per-database and handled in a similar
> way to tables and functions. I think perhaps we still need to discuss
> details of how the statistics should be handled but I'd like to share
> the patch for discussion.
>

Do you have something specific in mind to discuss the details of how
stats should be handled?

Few comments/questions:
====================
1.
 static void pgstat_reset_replslot(PgStat_StatReplSlotEntry
*slotstats, TimestampTz ts);

+
 static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);

Spurious line addition.

2. Why is there now no code to deal with dead table sync entries,
compared to the previous version of the patch?

3. Why do we need two different functions
pg_stat_reset_subscription_worker_sub and
pg_stat_reset_subscription_worker_subrel to handle reset? Isn't it
sufficient to reset all entries for a subscription if relid is
InvalidOid?

4. It seems now stats_reset entry is not present in
pg_stat_subscription_workers? How will users find that information if
required?

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

November 9, 2021, 09:27:10

On Tue, Nov 9, 2021 at 11:37 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Sun, Nov 7, 2021 at 7:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > I've attached an updated patch. In this version patch, subscription
> > worker statistics are collected per-database and handled in a similar
> > way to tables and functions. I think perhaps we still need to discuss
> > details of how the statistics should be handled but I'd like to share
> > the patch for discussion.
>
> While reviewing the v20, I have some initial comments,
>
> +     <row>
> +      <entry><structname>pg_stat_subscription_workers</structname><indexterm><primary>pg_stat_subscription_workers</primary></indexterm></entry>
> +      <entry>At least one row per subscription, showing about errors that
> +      occurred on subscription.
> +      See <link linkend="monitoring-pg-stat-subscription-workers">
> +      <structname>pg_stat_subscription_workers</structname></link> for details.
> +      </entry>
>
> 1.
> I don't like the fact that this view is very specific for showing the
> errors but the name of the view is very generic.  So are we keeping
> this name to expand the scope of the view in the future?
>

Yes, we are planning to display some other xact specific stats as well
corresponding to subscription workers. See [1][2].

[1] -
https://www.postgresql.org/message-id/OSBPR01MB48887CA8F40C8D984A6DC00CED199%40OSBPR01MB4888.jpnprd01.prod.outlook.com
[2] - https://www.postgresql.org/message-id/CAA4eK1%2B1n3upCMB-Y_k9b1wPNCtNE7MEHan9kA1s6GNsZGB0Og%40mail.gmail.com

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

November 9, 2021, 09:43:11

On Tue, Nov 9, 2021 at 3:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Nov 7, 2021 at 7:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Nov 3, 2021 at 12:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Nov 2, 2021 at 2:17 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > If we follow the idea of keeping stats at db level (in
> > > PgStat_StatDBEntry) as discussed above then I think we already have a
> > > way to remove stat entries via pg_stat_reset which removes the stats
> > > corresponding to tables, functions and after this patch corresponding
> > > to subscriptions as well for the current database. Won't that be
> > > sufficient? I see your point but I think it may be better if we keep
> > > the same behavior for stats of apply and table sync workers.
> >
> > Makes sense.
> >
>
> We can document this point.

Okay.

>
> > >
> > > Following the tables, functions, I thought of keeping the name of the
> > > reset function similar to "pg_stat_reset_single_table_counters" but I
> > > feel the currently used name "pg_stat_reset_subscription_worker" in
> > > the patch is better. Do let me know what you think?
> >
> > Yeah, I also tend to prefer pg_stat_reset_subscription_worker name
> > since "single" isn't clear in the context of subscription worker.  And
> > the behavior of the reset function for subscription workers is also
> > different from pg_stat_reset_single_xxx_counters.
> >
> > I've attached an updated patch. In this version patch, subscription
> > worker statistics are collected per-database and handled in a similar
> > way to tables and functions. I think perhaps we still need to discuss
> > details of how the statistics should be handled but I'd like to share
> > the patch for discussion.
> >
>
> Do you have something specific in mind to discuss the details of how
> stats should be handled?

As you commented, I removed stats_reset column from
pg_stat_subscription_workers view since tables and functions stats
view doesn't have it.

>
> Few comments/questions:
> ====================
> 1.
>  static void pgstat_reset_replslot(PgStat_StatReplSlotEntry
> *slotstats, TimestampTz ts);
>
> +
>  static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
>
> Spurious line addition.

Will fix.

>
> 2. Why is there now no code to deal with dead table sync entries,
> compared to the previous version of the patch?

I think we discussed that it's better if we keep the same behavior for
stats of apply and table sync workers. So the table sync entries are
dead after the subscription is dropped, like apply entries. No?


>
> 3. Why do we need two different functions
> pg_stat_reset_subscription_worker_sub and
> pg_stat_reset_subscription_worker_subrel to handle reset? Isn't it
> sufficient to reset all entries for a subscription if relid is
> InvalidOid?

Since passing InvalidOid as relid denotes the apply worker's entry, we
cannot use it for that purpose.

>
> 4. It seems now stats_reset entry is not present in
> pg_stat_subscription_workers? How will users find that information if
> required?

Users can find it in pg_stat_database. The same is true for table and
function statistics -- they don't have a stats_reset column, but
resetting them updates stats_reset of the database's entry in
pg_stat_database.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/
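
The answer to question 3 turns on how entries are keyed: an entry is identified by the pair (subid, subrelid), and an invalid subrelid already denotes the apply worker, so it cannot double as a "reset everything" wildcard. A minimal sketch of that distinction follows. The types and function names (`StatsEntry`, `reset_worker`, `reset_subscription`) are hypothetical simplifications for illustration, not the patch's actual code.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Hypothetical, simplified stats entry keyed by (subid, subrelid). */
typedef struct StatsEntry
{
    Oid         subid;          /* subscription OID */
    Oid         subrelid;       /* InvalidOid = apply worker, else synced table */
    long        error_count;
} StatsEntry;

/* Reset exactly one worker's entry; InvalidOid selects the apply worker. */
int
reset_worker(StatsEntry *tab, size_t n, Oid subid, Oid subrelid)
{
    int         nreset = 0;

    assert(tab != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        if (tab[i].subid == subid && tab[i].subrelid == subrelid)
        {
            tab[i].error_count = 0;
            nreset++;
        }
    }
    return nreset;
}

/* Reset every entry of the subscription: apply and table sync workers. */
int
reset_subscription(StatsEntry *tab, size_t n, Oid subid)
{
    int         nreset = 0;

    assert(tab != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        if (tab[i].subid == subid)
        {
            tab[i].error_count = 0;
            nreset++;
        }
    }
    return nreset;
}
```

Because `reset_worker(tab, n, subid, InvalidOid)` must mean "the apply worker only", a separate entry point is needed to express "all workers of this subscription".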

Re: Skipping logical replication transactions on subscriber side

From

Dilip Kumar

Date:

November 9, 2021, 10:40:04

On Tue, Nov 9, 2021 at 11:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Nov 9, 2021 at 11:37 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > 1.
> > I don't like the fact that this view is very specific for showing the
> > errors but the name of the view is very generic.  So are we keeping
> > this name to expand the scope of the view in the future?
> >
>
> Yes, we are planning to display some other xact specific stats as well
> corresponding to subscription workers. See [1][2].
>
> [1] -
https://www.postgresql.org/message-id/OSBPR01MB48887CA8F40C8D984A6DC00CED199%40OSBPR01MB4888.jpnprd01.prod.outlook.com
> [2] - https://www.postgresql.org/message-id/CAA4eK1%2B1n3upCMB-Y_k9b1wPNCtNE7MEHan9kA1s6GNsZGB0Og%40mail.gmail.com

Thanks for pointing me to this thread, I will have a look.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

November 9, 2021, 13:03:50

On Tue, Nov 9, 2021 at 12:13 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Nov 9, 2021 at 3:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > 4. It seems now stats_reset entry is not present in
> > pg_stat_subscription_workers? How will users find that information if
> > required?
>
> Users can find it in pg_stat_databases. The same is true for table and
> function statistics -- they don't have stats_reset column but reset
> stats_reset of its entry on pg_stat_database.
>

Okay, but isn't it better to deal with the reset of subscription
workers via pgstat_recv_resetsinglecounter by introducing subobjectid?
I think that will make code consistent for all database-related stats.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

9 November 2021, 15:55:34

On Tue, Nov 9, 2021 at 7:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Nov 9, 2021 at 12:13 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Nov 9, 2021 at 3:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > 4. It seems now stats_reset entry is not present in
> > > pg_stat_subscription_workers? How will users find that information if
> > > required?
> >
> > Users can find it in pg_stat_databases. The same is true for table and
> > function statistics -- they don't have stats_reset column but reset
> > stats_reset of its entry on pg_stat_database.
> >
>
> Okay, but isn't it better to deal with the reset of subscription
> workers via pgstat_recv_resetsinglecounter by introducing subobjectid?
> I think that will make code consistent for all database-related stats.
>

Agreed. It's better to use the same function internally even if the
SQL-callable interfaces are different.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

9 November 2021, 16:31:33

On Tue, Nov 9, 2021 at 1:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Nov 9, 2021 at 11:57 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Nov 9, 2021 at 11:37 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > > 1.
> > > I don't like the fact that this view is very specific for showing the
> > > errors but the name of the view is very generic.  So are we keeping
> > > this name to expand the scope of the view in the future?
> > >
> >
> > Yes, we are planning to display some other xact specific stats as well
> > corresponding to subscription workers. See [1][2].
> >
> > [1] - https://www.postgresql.org/message-id/OSBPR01MB48887CA8F40C8D984A6DC00CED199%40OSBPR01MB4888.jpnprd01.prod.outlook.com
> > [2] - https://www.postgresql.org/message-id/CAA4eK1%2B1n3upCMB-Y_k9b1wPNCtNE7MEHan9kA1s6GNsZGB0Og%40mail.gmail.com
>
> Thanks for pointing me to this thread, I will have a look.
>

I think we can even add a line in the commit message stating that this
can be extended in the future to track other xact related stats for
subscription workers. I think it will help readers of the patch.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

vignesh C

Date:

10 November 2021, 06:49:30

On Sun, Nov 7, 2021 at 7:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Nov 3, 2021 at 12:41 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Nov 2, 2021 at 2:17 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Tue, Nov 2, 2021 at 2:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Mon, Nov 1, 2021 at 7:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > On Fri, Oct 29, 2021 at 8:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > >
> > > > > >
> > > > > > Fair enough. So statistics can be removed either by vacuum or drop
> > > > > > subscription. Also, if we go by this logic then there is no harm in
> > > > > > retaining the stat entries for tablesync errors. Why have different
> > > > > > behavior for apply and tablesync workers?
> > > > >
> > > > > My understanding is that the subscription worker statistics entry
> > > > > corresponds to workers (but not physical workers since the physical
> > > > > process is changed after restarting). So if the worker finishes its
> > > > > jobs, it is no longer necessary to show errors since further problems
> > > > > will not occur after that. Table sync worker’s job finishes when
> > > > > completing table copy (unless table sync is performed again by REFRESH
> > > > > PUBLICATION) whereas apply worker’s job finishes when the subscription
> > > > > is dropped.
> > > > >
> > > >
> > > > Actually, I am not very sure how users can use the old error
> > > > information after we allowed skipping the conflicting xid. Say, if
> > > > they want to add/remove some constraints on the table based on
> > > > previous errors then they might want to refer to errors of both the
> > > > apply worker and table sync worker.
> > >
> > > I think that in general, statistics should be retained as long as a
> > > corresponding object exists on the database, like other cumulative
> > > statistic views. So I’m concerned that an entry of a cumulative stats
> > > view is automatically removed by a non-stats-related function (e.g.,
> > > ALTER SUBSCRIPTION SKIP), which seems a new behavior for cumulative
> > > stats views.
> > >
> > > We can retain the stats entries for table sync worker but what I want
> > > to avoid is that the view shows many old entries that will never be
> > > updated. I've sometimes seen cases where the user mistakenly restored
> > > table data on the subscriber before creating a subscription, failed
> > > table sync on many tables due to unique violation, and truncated
> > > tables on the subscriber. I think that unlike the stats entries for
> > > apply worker, retaining the stats entries for table sync could be
> > > harmful since it’s likely to be a large amount (even hundreds of
> > > entries). Especially, it could lead to bloat the stats file since it
> > > has an error message. So if we do that, I'd like to provide a function
> > > for users to remove (not reset) stats entries manually.
> > >
> >
> > If we follow the idea of keeping stats at db level (in
> > PgStat_StatDBEntry) as discussed above then I think we already have a
> > way to remove stat entries via pg_stat_reset which removes the stats
> > corresponding to tables, functions and after this patch corresponding
> > to subscriptions as well for the current database. Won't that be
> > sufficient? I see your point but I think it may be better if we keep
> > the same behavior for stats of apply and table sync workers.
>
> Make sense.
>
> >
> > Following the tables, functions, I thought of keeping the name of the
> > reset function similar to "pg_stat_reset_single_table_counters" but I
> > feel the currently used name "pg_stat_reset_subscription_worker" in
> > the patch is better. Do let me know what you think?
>
> Yeah, I also tend to prefer pg_stat_reset_subscription_worker name
> since "single" isn't clear in the context of subscription worker.  And
> the behavior of the reset function for subscription workers is also
> different from pg_stat_reset_single_xxx_counters.
>
> I've attached an updated patch. In this version patch, subscription
> worker statistics are collected per-database and handled in a similar
> way to tables and functions. I think perhaps we still need to discuss
> details of how the statistics should be handled but I'd like to share
> the patch for discussion.

Thanks for the updated patch. A few comments:
1) Should we change "Tables and functions hashes are initialized to
empty" to "Tables, functions and subworker hashes are initialized to
empty"?
+       hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
+       hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
+       dbentry->subworkers = hash_create("Per-database subscription worker",
+                                         PGSTAT_SUBWORKER_HASH_SIZE,
+                                         &hash_ctl,
+                                         HASH_ELEM | HASH_BLOBS);

2) Since databaseid, tabhash, funchash and subworkerhash are members
of dbentry, can we remove the function arguments databaseid, tabhash,
funchash and subworkerhash and pass dbentry similar to
pgstat_write_db_statsfile function?
@@ -4370,12 +4582,14 @@ done:
  */
 static void
 pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
-                                                bool permanent)
+                                                HTAB *subworkerhash, bool permanent)
 {
        PgStat_StatTabEntry *tabentry;
        PgStat_StatTabEntry tabbuf;
        PgStat_StatFuncEntry funcbuf;
        PgStat_StatFuncEntry *funcentry;
+       PgStat_StatSubWorkerEntry subwbuf;
+       PgStat_StatSubWorkerEntry *subwentry;

3) Can we move pgstat_get_subworker_entry below pgstat_get_db_entry
and pgstat_get_tab_entry, so that the hash lookup can be together
consistently. Similarly pgstat_send_subscription_purge can be moved
after pgstat_send_slru.
+/* ----------
+ * pgstat_get_subworker_entry
+ *
+ * Return subscription worker entry with the given subscription OID and
+ * relation OID.  If subrelid is InvalidOid, it returns an entry of the
+ * apply worker otherwise of the table sync worker associated with subrelid.
+ * If no subscription entry exists, initialize it, if the create parameter
+ * is true.  Else, return NULL.
+ * ----------
+ */
+static PgStat_StatSubWorkerEntry *
+pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry, Oid subid, Oid subrelid,
+                                                  bool create)
+{
+       PgStat_StatSubWorkerEntry *subwentry;
+       PgStat_StatSubWorkerKey key;
+       bool            found;

4) This change can be removed from pgstat.c:
@@ -332,9 +339,11 @@ static bool pgstat_db_requested(Oid databaseid);
 static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData name, bool create_it);
 static void pgstat_reset_replslot(PgStat_StatReplSlotEntry *slotstats, TimestampTz ts);

+
 static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
 static void pgstat_send_funcstats(void);

5) I was able to compile without including
catalog/pg_subscription_rel.h, so we can remove that include if it is
not required.
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -41,6 +41,8 @@
 #include "catalog/catalog.h"
 #include "catalog/pg_database.h"
 #include "catalog/pg_proc.h"
+#include "catalog/pg_subscription.h"
+#include "catalog/pg_subscription_rel.h"

6) Similarly, replication/logicalproto.h need not be included:
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -24,6 +24,7 @@
 #include "pgstat.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/postmaster.h"
+#include "replication/logicalproto.h"
 #include "replication/slot.h"
 #include "storage/proc.h"

7) There is an extra ";". We can remove one ";" below:
+       PgStat_StatSubWorkerKey key;
+       bool            found;
+       HASHACTION      action = (create ? HASH_ENTER : HASH_FIND);;
+
+       key.subid = subid;
+       key.subrelid = subrelid;

Regards,
Vignesh

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

15 November 2021, 04:37:36

On Mon, Nov 8, 2021 at 4:10 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Mon, Nov 8, 2021 at 1:20 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached an updated patch. In this version patch, subscription
> > worker statistics are collected per-database and handled in a similar
> > way to tables and functions. I think perhaps we still need to discuss
> > details of how the statistics should be handled but I'd like to share
> > the patch for discussion.
> >
>
> Thanks for the updated patch.
> Some initial comments on the v20 patch:

Thank you for the comments!

>
>
> doc/src/sgml/monitoring.sgml
>
> (1) wording
> The word "information" seems to be missing after "showing" (otherwise
> it reads "showing about errors", which isn't correct grammar).
> I suggest the following change:
>
> BEFORE:
> +      <entry>At least one row per subscription, showing about errors that
> +      occurred on subscription.
> AFTER:
> +      <entry>At least one row per subscription, showing information about
> +      errors that occurred on subscription.

Fixed.

>
> (2) pg_stat_reset_subscription_worker(subid Oid, relid Oid) function
> documentation
> The description doesn't read well. I'd suggest the following change:
>
> BEFORE:
> * Resets statistics of a single subscription worker statistics.
> AFTER:
> * Resets the statistics of a single subscription worker.
>
> I think that the documentation for this function should make it clear
> that a non-NULL "subid" parameter is required for both reset cases
> (tablesync and apply).
> Perhaps this could be done by simply changing the first sentence to say:
> "Resets the statistics of a single subscription worker, for a worker
> running on the subscription with <parameter>subid</parameter>."
> (and then can remove " running on the subscription with
> <parameter>subid</parameter>" from the last sentence)

Fixed.

>
> I think that the documentation for this function should say that it
> should be used in conjunction with the "pg_stat_subscription_workers"
> view in order to obtain the required subid/relid values for resetting.
> (and should provide a link to the documentation for that view)

I don't think users necessarily have to use
pg_stat_subscription_workers to obtain subid/relid, since the same
values can also be obtained from pg_subscription_rel. But I agree that
the documentation should clarify that this function resets entries of
the pg_stat_subscription_workers view. Fixed.

>
> Also, I think that the function documentation should make it clear how
> to distinguish the tablesync vs apply worker statistics case.
> e.g. the tablesync error case is indicated by a null "command" in the
> information returned from the "pg_stat_subscription_workers" view
> (otherwise it seems a user could only know this by looking at the server log).

The documentation of pg_stat_subscription_workers explains that
subrelid is always NULL for apply workers. Is it not enough?

>
> Finally, there are currently no tests for this new function.

I've added some tests.

>
> (3) pg_stat_subscription_workers
> In the documentation for this, some users may not realise that "the
> initial data copy" refers to "tablesync", so maybe say "the initial
> data copy (tablesync)", or similar.
>

Perhaps it's better not to use the term "tablesync" since we don't use
the term anywhere now. Instead, we should say it more clearly, e.g.
"subscription worker handling the initial data copy of the relation",
as the description of pg_stat_subscription does.

> (4) stats_reset
> "stats_reset" is currently documented as the last column of the
> "pg_stat_subscription_workers" view - but it's actually no longer
> included in the view.

Removed.

>
> (5) src/tools/pgindent/typedefs.list
> The following current entries are bogus:
> PgStat_MsgSubWorkerErrorPurge
> PgStat_MsgSubWorkerPurge
>
> The following entry is missing:
> PgStat_MsgSubscriptionPurge

Fixed.

I'll submit an updated patch soon.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

15 November 2021, 04:38:26

On Tue, Nov 9, 2021 at 3:07 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Sun, Nov 7, 2021 at 7:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > I've attached an updated patch. In this version patch, subscription
> > worker statistics are collected per-database and handled in a similar
> > way to tables and functions. I think perhaps we still need to discuss
> > details of how the statistics should be handled but I'd like to share
> > the patch for discussion.
>
> While reviewing the v20, I have some initial comments,
>
> +     <row>
> +      <entry><structname>pg_stat_subscription_workers</structname><indexterm><primary>pg_stat_subscription_workers</primary></indexterm></entry>
> +      <entry>At least one row per subscription, showing about errors that
> +      occurred on subscription.
> +      See <link linkend="monitoring-pg-stat-subscription-workers">
> +      <structname>pg_stat_subscription_workers</structname></link> for details.
> +      </entry>
>
> 1.
> I don't like the fact that this view is very specific for showing the
> errors but the name of the view is very generic.  So are we keeping
> this name to expand the scope of the view in the future?  If this is
> meant only for showing the errors then the name should be more
> specific.

As Amit already mentioned, we're planning to add more xact statistics
to this view. I've mentioned that in the commit message.

>
> 2.
> Why comment says "At least one row per subscription"? this looks
> confusing, I mean if there is no error then there will not be even one
> row right?
>
>
> +  <para>
> +   The <structname>pg_stat_subscription_workers</structname> view will contain
> +   one row per subscription error reported by workers applying logical
> +   replication changes and workers handling the initial data copy of the
> +   subscribed tables.
> +  </para>

Right. Fixed.

>
> 3.
> So there will only be one row per subscription?  I did not read the
> code, but suppose there was an error due to some constraint now if
> that constraint is removed and there is a new error then the old error
> will be removed immediately or it will be removed by auto vacuum?  If
> it is not removed immediately then there could be multiple errors per
> subscription in the view so the comment is not correct.

There is one row per subscription worker (apply worker and tablesync
worker). If the same error occurs consecutively, error_count is
incremented and last_error_time is updated. Otherwise, e.g., if a
different error occurs on the apply worker, all statistics are
updated.
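
The update rule described above can be sketched as follows. This is only
an illustration, not the patch's code: the struct, its field names, the
fixed message size, and the choice of fields that define "the same error"
(relation, command, xid and message text) are all assumptions made for
the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ERRMSG_LEN 256			/* hypothetical fixed message size */

/* Hypothetical per-worker entry (one per apply or table sync worker). */
typedef struct WorkerErrorStats
{
	uint32_t	relid;			/* relation being processed, 0 if none */
	int			command;		/* logical replication message type */
	uint32_t	xid;			/* remote transaction id */
	uint64_t	error_count;	/* consecutive occurrences of this error */
	int64_t		first_error_time;
	int64_t		last_error_time;
	char		message[ERRMSG_LEN];
} WorkerErrorStats;

/*
 * Record one reported error.  A repeat of the stored error only bumps the
 * counter and the last-seen time; any different error overwrites the entry.
 */
static void
record_error(WorkerErrorStats *e, uint32_t relid, int command,
			 uint32_t xid, const char *msg, int64_t now)
{
	int			same = e->error_count > 0 &&
		e->relid == relid &&
		e->command == command &&
		e->xid == xid &&
		strncmp(e->message, msg, ERRMSG_LEN - 1) == 0;

	if (same)
	{
		e->error_count++;
		e->last_error_time = now;
		return;
	}

	/* Different error: replace all statistics of this worker. */
	e->relid = relid;
	e->command = command;
	e->xid = xid;
	e->error_count = 1;
	e->first_error_time = now;
	e->last_error_time = now;
	strncpy(e->message, msg, ERRMSG_LEN - 1);
	e->message[ERRMSG_LEN - 1] = '\0';
}
```

With this rule the view never grows beyond one row per worker, since a
new error replaces the old one instead of being appended.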

>
> 4.
> +     <row>
> +      <entry role="catalog_table_entry"><para role="column_definition">
> +       <structfield>last_error_time</structfield> <type>timestamp
> with time zone</type>
> +      </para>
> +      <para>
> +       Time at which the last error occurred
> +      </para></entry>
> +     </row>
>
> Will it be useful to know when the first time error occurred?

Good idea. Users can know when the subscription stopped due to this
error. Added.

>
> 5.
> +     <row>
> +      <entry role="catalog_table_entry"><para role="column_definition">
> +       <structfield>stats_reset</structfield> <type>timestamp with
> time zone</type>
> +      </para>
> +      <para>
>
> The actual view does not contain this column.

Removed.

>
> 6.
> +       <para>
> +        Resets statistics of a single subscription worker statistics.
>
> /Resets statistics of a single subscription worker statistics/Resets
> statistics of a single subscription worker

Fixed.

I'll submit an updated patch soon.


Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

15 November 2021, 05:48:34

On Wed, Nov 10, 2021 at 12:49 PM vignesh C <vignesh21@gmail.com> wrote:
>
>
> Thanks for the updated patch, Few comments:

Thank you for the comments!

> 1) should we change "Tables and functions hashes are initialized to
> empty" to "Tables, functions and subworker hashes are initialized to
> empty"
> +       hash_ctl.keysize = sizeof(PgStat_StatSubWorkerKey);
> +       hash_ctl.entrysize = sizeof(PgStat_StatSubWorkerEntry);
> +       dbentry->subworkers = hash_create("Per-database subscription worker",
> +
>    PGSTAT_SUBWORKER_HASH_SIZE,
> +
>    &hash_ctl,
> +
>    HASH_ELEM | HASH_BLOBS);

Fixed.

>
> 2) Since databaseid, tabhash, funchash and subworkerhash are members
> of dbentry, can we remove the function arguments databaseid, tabhash,
> funchash and subworkerhash and pass dbentry similar to
> pgstat_write_db_statsfile function?
> @@ -4370,12 +4582,14 @@ done:
>   */
>  static void
>  pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash,
> -                                                bool permanent)
> +                                                HTAB *subworkerhash,
> bool permanent)
>  {
>         PgStat_StatTabEntry *tabentry;
>         PgStat_StatTabEntry tabbuf;
>         PgStat_StatFuncEntry funcbuf;
>         PgStat_StatFuncEntry *funcentry;
> +       PgStat_StatSubWorkerEntry subwbuf;
> +       PgStat_StatSubWorkerEntry *subwentry;
>

As the comment of this function says, this function has the ability to
skip storing per-table or per-function (and/or
per-subscription-worker) data if NULL is passed for the
corresponding hashtable, although that's not used at the moment. IMO
it'd be better to keep such behavior.
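
The "NULL means skip" convention can be sketched like this. It is a
simplified illustration only: EntryTable stands in for the real hash
tables, and the one-character record tags are made up for the example
rather than taken from the stats file format.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a hash table: it only counts stored entries. */
typedef struct EntryTable
{
	int			nentries;
} EntryTable;

/*
 * Walk a stream of record tags ('T' = table, 'F' = function,
 * 'S' = subscription worker; illustrative only).  A record is stored only
 * if the caller passed a non-NULL table for its kind; otherwise it is read
 * and discarded.
 */
static void
read_stats_records(const char *records, EntryTable *tabhash,
				   EntryTable *funchash, EntryTable *subworkerhash)
{
	const char *p;

	for (p = records; *p != '\0'; p++)
	{
		switch (*p)
		{
			case 'T':
				if (tabhash != NULL)
					tabhash->nentries++;
				break;
			case 'F':
				if (funchash != NULL)
					funchash->nentries++;
				break;
			case 'S':
				if (subworkerhash != NULL)
					subworkerhash->nentries++;
				break;
			default:
				break;			/* unknown tag: ignored in this sketch */
		}
	}
}
```

Passing a single dbentry instead would remove the caller's ability to
request only some of the three kinds of data.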

> 3) Can we move pgstat_get_subworker_entry below pgstat_get_db_entry
> and pgstat_get_tab_entry, so that the hash lookup can be together
> consistently. Similarly pgstat_send_subscription_purge can be moved
> after pgstat_send_slru.
> +/* ----------
> + * pgstat_get_subworker_entry
> + *
> + * Return subscription worker entry with the given subscription OID and
> + * relation OID.  If subrelid is InvalidOid, it returns an entry of the
> + * apply worker otherwise of the table sync worker associated with subrelid.
> + * If no subscription entry exists, initialize it, if the create parameter
> + * is true.  Else, return NULL.
> + * ----------
> + */
> +static PgStat_StatSubWorkerEntry *
> +pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry, Oid subid,
> Oid subrelid,
> +                                                  bool create)
> +{
> +       PgStat_StatSubWorkerEntry *subwentry;
> +       PgStat_StatSubWorkerKey key;
> +       bool            found;

Agreed. Moved.

>
> 4) This change can be removed from pgstat.c:
> @@ -332,9 +339,11 @@ static bool pgstat_db_requested(Oid databaseid);
>  static PgStat_StatReplSlotEntry *pgstat_get_replslot_entry(NameData
> name, bool create_it);
>  static void pgstat_reset_replslot(PgStat_StatReplSlotEntry
> *slotstats, TimestampTz ts);
>
> +
>  static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg, TimestampTz now);
>  static void pgstat_send_funcstats(void);

Removed.

>
> 5) I was able to compile without including
> catalog/pg_subscription_rel.h, we can remove including
> catalog/pg_subscription_rel.h if not required.
> --- a/src/backend/postmaster/pgstat.c
> +++ b/src/backend/postmaster/pgstat.c
> @@ -41,6 +41,8 @@
>  #include "catalog/catalog.h"
>  #include "catalog/pg_database.h"
>  #include "catalog/pg_proc.h"
> +#include "catalog/pg_subscription.h"
> +#include "catalog/pg_subscription_rel.h"

Removed.

>
>  6) Similarly replication/logicalproto.h also need not be included
>  --- a/src/backend/utils/adt/pgstatfuncs.c
> +++ b/src/backend/utils/adt/pgstatfuncs.c
> @@ -24,6 +24,7 @@
>  #include "pgstat.h"
>  #include "postmaster/bgworker_internals.h"
>  #include "postmaster/postmaster.h"
> +#include "replication/logicalproto.h"
>  #include "replication/slot.h"
>  #include "storage/proc.h"

Removed.

>
> 7) There is an extra ";", We can remove one ";" from below:
> +       PgStat_StatSubWorkerKey key;
> +       bool            found;
> +       HASHACTION      action = (create ? HASH_ENTER : HASH_FIND);;
> +
> +       key.subid = subid;
> +       key.subrelid = subrelid;

Fixed.

I've attached an updated patch that incorporates all comments I got so
far. Please review it.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

v21-0001-Add-a-subscription-worker-statistics-view-pg_sta.patch

Re: Skipping logical replication transactions on subscriber side

From

Greg Nancarrow

Date:

15 November 2021, 10:49:05

On Mon, Nov 15, 2021 at 1:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached an updated patch that incorporates all comments I got so
> far. Please review it.
>

Thanks for the updated patch.
A few minor comments:

doc/src/sgml/monitoring.sgml

(1) tab in doc updates

There's a tab before "Otherwise,":

+        copy of the relation with <parameter>relid</parameter>.
        Otherwise,

src/backend/utils/adt/pgstatfuncs.c

(2) The function comment for "pg_stat_reset_subscription_worker_sub"
seems a bit long and I expected it to be multi-line (did you run
pgindent?)

src/include/pgstat.h

(3) Remove PgStat_StatSubWorkerEntry.dbid?

The "dbid" member of the new PgStat_StatSubWorkerEntry struct doesn't
seem to be used, so I think it should be removed.
(I could remove it and everything builds OK and tests pass).


Regards,
Greg Nancarrow
Fujitsu Australia

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

15 November 2021, 12:17:34

On Mon, Nov 15, 2021 at 4:49 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Mon, Nov 15, 2021 at 1:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached an updated patch that incorporates all comments I got so
> > far. Please review it.
> >
>
> Thanks for the updated patch.
> A few minor comments:
>
> doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
>
> (1) tab in doc updates
>
> There's a tab before "Otherwise,":
>
> +        copy of the relation with <parameter>relid</parameter>.
>         Otherwise,

Fixed.

>
> src/backend/utils/adt/pgstatfuncs.c
>
> (2) The function comment for "pg_stat_reset_subscription_worker_sub"
> seems a bit long and I expected it to be multi-line (did you run
> pg_indent?)

I ran pgindent on pgstatfuncs.c but it didn't become a multi-line comment.

>
> src/include/pgstat.h
>
> (3) Remove PgStat_StatSubWorkerEntry.dbid?
>
> The "dbid" member of the new PgStat_StatSubWorkerEntry struct doesn't
> seem to be used, so I think it should be removed.
> (I could remove it and everything builds OK and tests pass).
>

Fixed.

Thank you for the comments! I've attached an updated version of the patch.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

v22-0001-Add-a-subscription-worker-statistics-view-pg_sta.patch

Re: Skipping logical replication transactions on subscriber side

From

vignesh C

Date:

15 November 2021, 17:43:14

On Mon, Nov 15, 2021 at 2:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Nov 15, 2021 at 4:49 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >
> > On Mon, Nov 15, 2021 at 1:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > I've attached an updated patch that incorporates all comments I got so
> > > far. Please review it.
> > >
> >
> > Thanks for the updated patch.
> > A few minor comments:
> >
> > doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
> >
> > (1) tab in doc updates
> >
> > There's a tab before "Otherwise,":
> >
> > +        copy of the relation with <parameter>relid</parameter>.
> >         Otherwise,
>
> Fixed.
>
> >
> > src/backend/utils/adt/pgstatfuncs.c
> >
> > (2) The function comment for "pg_stat_reset_subscription_worker_sub"
> > seems a bit long and I expected it to be multi-line (did you run
> > pg_indent?)
>
> I ran pg_indent on pgstatfuncs.c but it didn't become a multi-line comment.
>
> >
> > src/include/pgstat.h
> >
> > (3) Remove PgStat_StatSubWorkerEntry.dbid?
> >
> > The "dbid" member of the new PgStat_StatSubWorkerEntry struct doesn't
> > seem to be used, so I think it should be removed.
> > (I could remove it and everything builds OK and tests pass).
> >
>
> Fixed.
>
> Thank you for the comments! I've updated an updated version patch.

Thanks for the updated patch.
I found one issue:
This Assert can fail in a few cases:
+void
+pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
+                                                         LogicalRepMsgType command, TransactionId xid,
+                                                         const char *errmsg)
+{
+       PgStat_MsgSubWorkerError msg;
+       int                     len;
+
+       Assert(strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN);
+       len = offsetof(PgStat_MsgSubWorkerError, m_message[0]) + strlen(errmsg) + 1;
+

I could reproduce the problem with the following scenario:
Publisher:
create table t1 (c1 varchar);
create publication pub1 for table t1;
insert into t1 values(repeat('abcd', 5000));

Subscriber:
create table t1(c1 smallint);
create subscription sub1 connection 'dbname=postgres port=5432'
publication pub1 with ( two_phase = true);
postgres=# select * from pg_stat_subscription_workers;
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process
exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and
repeat your command.
server closed the connection unexpectedly
   This probably means the server terminated abnormally
   before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

Subscriber logs:
2021-11-15 19:27:56.380 IST [15685] LOG:  logical replication apply
worker for subscription "sub1" has started
2021-11-15 19:27:56.384 IST [15687] LOG:  logical replication table
synchronization worker for subscription "sub1", table "t1" has started
TRAP: FailedAssertion("strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN",
File: "pgstat.c", Line: 1946, PID: 15687)
postgres: logical replication worker for subscription 16387 sync 16384
(ExceptionalCondition+0xd0)[0x55a18f3c727f]
postgres: logical replication worker for subscription 16387 sync 16384
(pgstat_report_subworker_error+0x7a)[0x55a18f126417]
postgres: logical replication worker for subscription 16387 sync 16384
(ApplyWorkerMain+0x493)[0x55a18f176611]
postgres: logical replication worker for subscription 16387 sync 16384
(StartBackgroundWorker+0x23c)[0x55a18f11f7e2]
postgres: logical replication worker for subscription 16387 sync 16384
(+0x54efc0)[0x55a18f134fc0]
postgres: logical replication worker for subscription 16387 sync 16384
(+0x54f3af)[0x55a18f1353af]
postgres: logical replication worker for subscription 16387 sync 16384
(+0x54e338)[0x55a18f134338]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x141f0)[0x7feef84371f0]
/lib/x86_64-linux-gnu/libc.so.6(__select+0x57)[0x7feef81e3ac7]
postgres: logical replication worker for subscription 16387 sync 16384
(+0x5498c2)[0x55a18f12f8c2]
postgres: logical replication worker for subscription 16387 sync 16384
(PostmasterMain+0x134c)[0x55a18f12f1dd]
postgres: logical replication worker for subscription 16387 sync 16384
(+0x43c3d4)[0x55a18f0223d4]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xd5)[0x7feef80fd565]
postgres: logical replication worker for subscription 16387 sync 16384
(_start+0x2e)[0x55a18ecaf4fe]
2021-11-15 19:27:56.483 IST [15645] LOG:  background worker "logical
replication worker" (PID 15687) was terminated by signal 6: Aborted
2021-11-15 19:27:56.483 IST [15645] LOG:  terminating any other active
server processes
2021-11-15 19:27:56.485 IST [15645] LOG:  all server processes
terminated; reinitializing

Here it fails because of a long error message "invalid input syntax
for type smallint:

\"abcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabc...."
because we try to insert varchar type data into smallint type.  Maybe
we should trim the error message in this case.
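
One way to trim the message could look like the sketch below. It is not
taken from the patch: the function name and the buffer size constant are
hypothetical stand-ins, and it truncates byte-wise, so a real fix would
also need to avoid cutting a multibyte character in half.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for PGSTAT_SUBWORKERERROR_MSGLEN. */
#define SUBWORKER_ERRMSG_LEN 256

/*
 * Copy errmsg into dst (of size dstsize), truncating an over-long message
 * instead of asserting on its length.  Returns the number of bytes stored,
 * not counting the terminating NUL.
 */
static size_t
copy_errmsg_truncated(char *dst, size_t dstsize, const char *errmsg)
{
	size_t		len;

	assert(dstsize > 0);

	len = strlen(errmsg);
	if (len >= dstsize)
		len = dstsize - 1;		/* leave room for the terminator */

	memcpy(dst, errmsg, len);
	dst[len] = '\0';
	return len;
}
```

The message length sent to the collector would then be computed from the
returned value rather than from strlen(errmsg).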

Regards,
Vignesh

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

16 November 2021, 09:31:18

On Mon, Nov 15, 2021 at 11:43 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, Nov 15, 2021 at 2:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Nov 15, 2021 at 4:49 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> > >
> > > On Mon, Nov 15, 2021 at 1:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > I've attached an updated patch that incorporates all comments I got so
> > > > far. Please review it.
> > > >
> > >
> > > Thanks for the updated patch.
> > > A few minor comments:
> > >
> > > doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
> > >
> > > (1) tab in doc updates
> > >
> > > There's a tab before "Otherwise,":
> > >
> > > +        copy of the relation with <parameter>relid</parameter>.
> > >         Otherwise,
> >
> > Fixed.
> >
> > >
> > > src/backend/utils/adt/pgstatfuncs.c
> > >
> > > (2) The function comment for "pg_stat_reset_subscription_worker_sub"
> > > seems a bit long and I expected it to be multi-line (did you run
> > > pg_indent?)
> >
> > I ran pg_indent on pgstatfuncs.c but it didn't become a multi-line comment.
> >
> > >
> > > src/include/pgstat.h
> > >
> > > (3) Remove PgStat_StatSubWorkerEntry.dbid?
> > >
> > > The "dbid" member of the new PgStat_StatSubWorkerEntry struct doesn't
> > > seem to be used, so I think it should be removed.
> > > (I could remove it and everything builds OK and tests pass).
> > >
> >
> > Fixed.
> >
> > Thank you for the comments! I've attached an updated version of the patch.
>
> Thanks for the updated patch.
> I found one issue:
> This Assert can fail in few cases:
> +void
> +pgstat_report_subworker_error(Oid subid, Oid subrelid, Oid relid,
> +                              LogicalRepMsgType command, TransactionId xid,
> +                              const char *errmsg)
> +{
> +       PgStat_MsgSubWorkerError msg;
> +       int                     len;
> +
> +       Assert(strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN);
> +       len = offsetof(PgStat_MsgSubWorkerError, m_message[0]) + strlen(errmsg) + 1;
> +
>
> I could reproduce the problem with the following scenario:
> Publisher:
> create table t1 (c1 varchar);
> create publication pub1 for table t1;
> insert into t1 values(repeat('abcd', 5000));
>
> Subscriber:
> create table t1(c1 smallint);
> create subscription sub1 connection 'dbname=postgres port=5432'
> publication pub1 with ( two_phase = true);
> postgres=# select * from pg_stat_subscription_workers;
> WARNING:  terminating connection because of crash of another server process
> DETAIL:  The postmaster has commanded this server process to roll back
> the current transaction and exit, because another server process
> exited abnormally and possibly corrupted shared memory.
> HINT:  In a moment you should be able to reconnect to the database and
> repeat your command.
> server closed the connection unexpectedly
>    This probably means the server terminated abnormally
>    before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
>
> Subscriber logs:
> 2021-11-15 19:27:56.380 IST [15685] LOG:  logical replication apply
> worker for subscription "sub1" has started
> 2021-11-15 19:27:56.384 IST [15687] LOG:  logical replication table
> synchronization worker for subscription "sub1", table "t1" has started
> TRAP: FailedAssertion("strlen(errmsg) < PGSTAT_SUBWORKERERROR_MSGLEN",
> File: "pgstat.c", Line: 1946, PID: 15687)
> postgres: logical replication worker for subscription 16387 sync 16384
> (ExceptionalCondition+0xd0)[0x55a18f3c727f]
> postgres: logical replication worker for subscription 16387 sync 16384
> (pgstat_report_subworker_error+0x7a)[0x55a18f126417]
> postgres: logical replication worker for subscription 16387 sync 16384
> (ApplyWorkerMain+0x493)[0x55a18f176611]
> postgres: logical replication worker for subscription 16387 sync 16384
> (StartBackgroundWorker+0x23c)[0x55a18f11f7e2]
> postgres: logical replication worker for subscription 16387 sync 16384
> (+0x54efc0)[0x55a18f134fc0]
> postgres: logical replication worker for subscription 16387 sync 16384
> (+0x54f3af)[0x55a18f1353af]
> postgres: logical replication worker for subscription 16387 sync 16384
> (+0x54e338)[0x55a18f134338]
> /lib/x86_64-linux-gnu/libpthread.so.0(+0x141f0)[0x7feef84371f0]
> /lib/x86_64-linux-gnu/libc.so.6(__select+0x57)[0x7feef81e3ac7]
> postgres: logical replication worker for subscription 16387 sync 16384
> (+0x5498c2)[0x55a18f12f8c2]
> postgres: logical replication worker for subscription 16387 sync 16384
> (PostmasterMain+0x134c)[0x55a18f12f1dd]
> postgres: logical replication worker for subscription 16387 sync 16384
> (+0x43c3d4)[0x55a18f0223d4]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xd5)[0x7feef80fd565]
> postgres: logical replication worker for subscription 16387 sync 16384
> (_start+0x2e)[0x55a18ecaf4fe]
> 2021-11-15 19:27:56.483 IST [15645] LOG:  background worker "logical
> replication worker" (PID 15687) was terminated by signal 6: Aborted
> 2021-11-15 19:27:56.483 IST [15645] LOG:  terminating any other active
> server processes
> 2021-11-15 19:27:56.485 IST [15645] LOG:  all server processes
> terminated; reinitializing
>
> Here it fails because of a long error message ""invalid input syntax
> for type smallint:

Good catch!

>
\"abcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabcdabc...."
> because we try to insert varchar type data into smallint type.  Maybe
> we should trim the error message in this case.

Right. I've fixed this issue and attached an updated patch.


Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

v23-0001-Add-a-subscription-worker-statistics-view-pg_sta.patch

RE: Skipping logical replication transactions on subscriber side

From

"houzj.fnst@fujitsu.com"

Date:

November 17, 2021, 06:43:00

On Tues, Nov 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> Right. I've fixed this issue and attached an updated patch.

Hi,

Thanks for updating the patch.
Here are a few comments.

1)

+        <function>pg_stat_reset_subscription_worker</function> ( <parameter>subid</parameter> <type>oid</type>,
<optional><parameter>relid</parameter> <type>oid</type> </optional> )
 

It seems we should put '<optional>' before the comma(',').


2)
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>subrelid</structfield> <type>oid</type>
+      </para>
+      <para>
+       OID of the relation that the worker is synchronizing; null for the
+       main apply worker
+      </para></entry>
+     </row>

Is 'subrelid' only used to distinguish the worker type? If so, would it
be clearer to have a string value here? I recall that a previous version of the
patch had a failure_source column, but it was removed. Maybe I missed something.


3)
+extern void pgstat_reset_subworker_stats(Oid subid, Oid subrelid, bool allstats);

I couldn't find the definition of this function; maybe we can remove this declaration?

Best regards,
Hou zj

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

November 17, 2021, 07:58:46

On Wed, Nov 17, 2021 at 9:13 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Tues, Nov 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> 2)
> +     <row>
> +      <entry role="catalog_table_entry"><para role="column_definition">
> +       <structfield>subrelid</structfield> <type>oid</type>
> +      </para>
> +      <para>
> +       OID of the relation that the worker is synchronizing; null for the
> +       main apply worker
> +      </para></entry>
> +     </row>
>
> Is the 'subrelid' only used for distinguishing the worker type ?
>

I think it also tells us which table sync worker the entry belongs to.

> If so, would it
> be clear to have a string value here. I recalled the previous version patch has
> failure_source column but was removed. Maybe I missed something.
>

I also don't remember the reason for this, but I would like to know.

I am also reviewing the latest version of the patch and will share
comments/questions sometime today.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

November 17, 2021, 08:56:03

On Wed, Nov 17, 2021 at 1:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Nov 17, 2021 at 9:13 AM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Tues, Nov 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > 2)
> > +     <row>
> > +      <entry role="catalog_table_entry"><para role="column_definition">
> > +       <structfield>subrelid</structfield> <type>oid</type>
> > +      </para>
> > +      <para>
> > +       OID of the relation that the worker is synchronizing; null for the
> > +       main apply worker
> > +      </para></entry>
> > +     </row>
> >
> > Is the 'subrelid' only used for distinguishing the worker type ?
> >
>
> I think it will additionally tell which table sync worker as well.

Right.

>
> > If so, would it
> > be clear to have a string value here. I recalled the previous version patch has
> > failure_source column but was removed. Maybe I missed something.
> >
>
> I also don't remember the reason for this but like to know.

I felt it's a bit redundant. Setting subrelid to NULL already means
that it’s an entry for the apply worker. If users want a value
like “apply” or “tablesync” for each entry, they can derive it from the
subrelid value.
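
As a rough illustration of that derivation, the sketch below labels an entry from its subrelid. worker_type() is a hypothetical helper, and the sketch assumes that a NULL subrelid in the view corresponds to InvalidOid internally; the Oid and InvalidOid definitions are local stand-ins so that the example compiles on its own.

```c
/* Local stand-ins for PostgreSQL's Oid and InvalidOid. */
typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/*
 * Hypothetical helper: an entry with no relation being synchronized is
 * the main apply worker; any other entry is a table sync worker.
 */
const char *
worker_type(Oid subrelid)
{
	return (subrelid == InvalidOid) ? "apply" : "tablesync";
}
```

For example, worker_type(InvalidOid) yields "apply", while worker_type(16384) yields "tablesync".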

> I am also reviewing the latest version of the patch and will share
> comments/questions sometime today.

Thanks!

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

vignesh C

Date:

November 17, 2021, 09:52:05

On Tue, Nov 16, 2021 at 12:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Right. I've fixed this issue and attached an updated patch.

Thanks for the updated patch. The issue is fixed in the patch provided.
I found that in one of the scenarios the statistics are getting lost:
Test steps:
Step 1:
Set up the publisher (create 100 publications pub1...pub100 for t1...t100) like below:
===============================================
create table t1(c1 int);
create publication pub1 for table t1;
insert into t1 values(10);
insert into t1 values(10);
create table t2(c1 int);
create publication pub2 for table t2;
insert into t2 values(10);
insert into t2 values(10);
....

The script can be generated using:
a=0
while [ "$a" -lt 100 ]
do
    a=$((a + 1))
    echo "./psql -d postgres -p 5432 -c \"create table t$a(c1 int);\"" >> publisher.sh
    echo "./psql -d postgres -p 5432 -c \"create publication pub$a for table t$a;\"" >> publisher.sh
    echo "./psql -d postgres -p 5432 -c \"insert into t$a values(10);\"" >> publisher.sh
    echo "./psql -d postgres -p 5432 -c \"insert into t$a values(10);\"" >> publisher.sh
done

Step 2:
Set up the subscriber (create 100 subscriptions):
===============================================
create table t1(c1 int primary key);
create subscription sub1 connection 'dbname=postgres port=5432'
publication pub1;
create table t2(c1 int primary key);
create subscription sub2 connection 'dbname=postgres port=5432'
publication pub2;
....

The script can be generated using:
a=0
while [ "$a" -lt 100 ]
do
    a=$((a + 1))
    echo "./psql -d postgres -p 5433 -c \"create table t$a(c1 int primary key);\"" >> subscriber.sh
    echo "./psql -d postgres -p 5433 -c \"create subscription sub$a connection 'dbname=postgres port=5432' publication pub$a;\"" >> subscriber.sh
done

Step 3:
postgres=# select * from pg_stat_subscription_workers order by subid;
 subid | subname | subrelid | relid | command | xid | error_count |                       error_message                        |         first_error_time         |         last_error_time
-------+---------+----------+-------+---------+-----+-------------+------------------------------------------------------------+----------------------------------+----------------------------------
 16389 | sub1    |    16384 | 16384 |         |     |          17 | duplicate key value violates unique constraint "t1_pkey"   | 2021-11-17 12:01:46.141086+05:30 | 2021-11-17 12:03:13.175698+05:30
 16395 | sub2    |    16390 | 16390 |         |     |          16 | duplicate key value violates unique constraint "t2_pkey"   | 2021-11-17 12:01:51.337055+05:30 | 2021-11-17 12:03:15.512249+05:30
 16401 | sub3    |    16396 | 16396 |         |     |          16 | duplicate key value violates unique constraint "t3_pkey"   | 2021-11-17 12:01:51.352157+05:30 | 2021-11-17 12:03:15.802225+05:30
 16407 | sub4    |    16402 | 16402 |         |     |          16 | duplicate key value violates unique constraint "t4_pkey"   | 2021-11-17 12:01:51.390638+05:30 | 2021-11-17 12:03:14.709496+05:30
 16413 | sub5    |    16408 | 16408 |         |     |          16 | duplicate key value violates unique constraint "t5_pkey"   | 2021-11-17 12:01:51.418825+05:30 | 2021-11-17 12:03:15.257235+05:30

Step 4:
Then restart the publisher

Step 5:
postgres=# select * from pg_stat_subscription_workers order by subid;
 subid | subname | subrelid | relid | command | xid | error_count |                                                              error_message                                                              |         first_error_time         |         last_error_time
-------+---------+----------+-------+---------+-----+-------------+-----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+----------------------------------
 16389 | sub1    |    16384 | 16384 |         |     |           1 | could not create replication slot "pg_16389_sync_16384_7031422794938304519": FATAL: terminating connection due to administrator command+| 2021-11-17 12:03:28.201247+05:30 | 2021-11-17 12:03:28.201247+05:30
       |         |          |       |         |     |             | server closed the connection unexpectedly                                                                                              +|                                  |
       |         |          |       |         |     |             | This probably means the server terminated abnormally                                                                                   +|                                  |
       |         |          |       |         |     |             | before or while proce                                                                                                                   |                                  |
 16395 | sub2    |    16390 | 16390 |         |     |          18 | duplicate key value violates unique constraint "t2_pkey"                                                                                | 2021-11-17 12:01:51.337055+05:30 | 2021-11-17 12:03:23.832585+05:30
 16401 | sub3    |    16396 | 16396 |         |     |          18 | duplicate key value violates unique constraint "t3_pkey"                                                                                | 2021-11-17 12:01:51.352157+05:30 | 2021-11-17 12:03:26.567873+05:30
 16407 | sub4    |    16402 | 16402 |         |     |           1 | could not create replication slot "pg_16407_sync_16402_7031422794938304519": FATAL: terminating connection due to administrator command+| 2021-11-17 12:03:28.196958+05:30 | 2021-11-17 12:03:28.196958+05:30
       |         |          |       |         |     |             | server closed the connection unexpectedly                                                                                              +|                                  |
       |         |          |       |         |     |             | This probably means the server terminated abnormally                                                                                   +|                                  |
       |         |          |       |         |     |             | before or while proce                                                                                                                   |                                  |
 16413 | sub5    |    16408 | 16408 |         |     |          18 | duplicate key value violates unique constraint "t5_pkey"                                                                                | 2021-11-17 12:01:51.418825+05:30 | 2021-11-17 12:03:25.595697+05:30

Step 6:
postgres=# select * from pg_stat_subscription_workers order by subid;
 subid | subname | subrelid | relid | command | xid | error_count |                       error_message                        |         first_error_time         |         last_error_time
-------+---------+----------+-------+---------+-----+-------------+------------------------------------------------------------+----------------------------------+----------------------------------
 16389 | sub1    |    16384 | 16384 |         |     |           1 | duplicate key value violates unique constraint "t1_pkey"   | 2021-11-17 12:03:33.346514+05:30 | 2021-11-17 12:03:33.346514+05:30
 16395 | sub2    |    16390 | 16390 |         |     |          19 | duplicate key value violates unique constraint "t2_pkey"   | 2021-11-17 12:01:51.337055+05:30 | 2021-11-17 12:03:33.437505+05:30
 16401 | sub3    |    16396 | 16396 |         |     |          19 | duplicate key value violates unique constraint "t3_pkey"   | 2021-11-17 12:01:51.352157+05:30 | 2021-11-17 12:03:33.482954+05:30
 16407 | sub4    |    16402 | 16402 |         |     |           1 | duplicate key value violates unique constraint "t4_pkey"   | 2021-11-17 12:03:33.327489+05:30 | 2021-11-17 12:03:33.327489+05:30
 16413 | sub5    |    16408 | 16408 |         |     |          19 | duplicate key value violates unique constraint "t5_pkey"   | 2021-11-17 12:01:51.418825+05:30 | 2021-11-17 12:03:33.374522+05:30

We can see that the statistics for sub1 and sub4 are lost: the old
error_count value is gone. I'm not sure whether this behavior is OK. Thoughts?

Regards,
Vignesh

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

November 17, 2021, 10:54:07

On Wed, Nov 17, 2021 at 3:52 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for the updated patch. The issue is fixed in the patch provided.
> I found that in one of the scenarios the statistics is getting lost:

Thank you for the tests!!

> We can see that sub1 and sub4 statistics are lost, old error_count
> value is lost. I'm not sure if this behavior is ok or not. Thoughts?
>

Looking at the outputs of steps 3, 5, and 6, the error messages
differ. In the current design, error_count is incremented only when
exactly the same error (i.e., the same xid, command, relid, and error
message) occurs again. Since a different kind of error happened on the
subscription, error_count was reset. Similarly, the
first_error_time value was also reset.
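
As an illustration of the rule above, here is a minimal, self-contained
C sketch. It is not the patch's code: the struct SubWorkerErrorStats, the
functions is_same_error() and record_error(), the ERRMSG_LEN constant and
the plain integer timestamps are all hypothetical stand-ins. It only
assumes that an error counts as "the same" when relid, xid, command and
message all match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ERRMSG_LEN 256          /* hypothetical message buffer size */

/* Hypothetical, simplified stand-in for a per-worker stats entry. */
typedef struct SubWorkerErrorStats
{
    uint32_t    relid;          /* relation being processed, 0 if none */
    uint32_t    xid;            /* remote transaction ID, 0 if none */
    int         command;        /* command being applied, 0 if none */
    char        message[ERRMSG_LEN];
    int64_t     error_count;
    int64_t     first_error_time;
    int64_t     last_error_time;
} SubWorkerErrorStats;

/* True if the reported error matches the one already stored. */
bool
is_same_error(const SubWorkerErrorStats *s, uint32_t relid, uint32_t xid,
              int command, const char *message)
{
    return s->error_count > 0 &&
        s->relid == relid &&
        s->xid == xid &&
        s->command == command &&
        strncmp(s->message, message, ERRMSG_LEN - 1) == 0;
}

/* Record one error report arriving at time "now". */
void
record_error(SubWorkerErrorStats *s, uint32_t relid, uint32_t xid,
             int command, const char *message, int64_t now)
{
    assert(s != NULL && message != NULL);

    if (is_same_error(s, relid, xid, command, message))
    {
        /* Same error again: bump the counter and the last timestamp. */
        s->error_count++;
        s->last_error_time = now;
        return;
    }

    /* A different error: overwrite the entry and start counting again. */
    s->relid = relid;
    s->xid = xid;
    s->command = command;
    strncpy(s->message, message, ERRMSG_LEN - 1);
    s->message[ERRMSG_LEN - 1] = '\0';
    s->error_count = 1;
    s->first_error_time = now;
    s->last_error_time = now;
}
```

Under this rule, the sub1 and sub4 rows in step 6 start again at
error_count = 1, because the slot-creation failure seen in step 5
replaced the earlier duplicate-key error.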

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

vignesh C

Date:

17 November 2021, 13:45:49

On Tue, Nov 16, 2021 at 12:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Nov 15, 2021 at 11:43 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Mon, Nov 15, 2021 at 2:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Mon, Nov 15, 2021 at 4:49 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> > > >
> > > > On Mon, Nov 15, 2021 at 1:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
>
> Right. I've fixed this issue and attached an updated patch.

Few comments:
1) should we set subwentry to NULL to handle !create && !found case
or we could return NULL similar to the earlier function.
+static PgStat_StatSubWorkerEntry *
+pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry, Oid subid, Oid subrelid,
+                           bool create)
+{
+       PgStat_StatSubWorkerEntry *subwentry;
+       PgStat_StatSubWorkerKey key;
+       bool            found;
+       HASHACTION      action = (create ? HASH_ENTER : HASH_FIND);
+
+       key.subid = subid;
+       key.subrelid = subrelid;
+       subwentry = (PgStat_StatSubWorkerEntry *) hash_search(dbentry->subworkers,
+                                                             (void *) &key,
+                                                             action, &found);
+
+       /* If not found, initialize the new one */
+       if (create && !found)

2) Should we keep the line width to 80 chars:
+/* ----------
+ * PgStat_MsgSubWorkerError            Sent by the apply worker or the table sync worker to
+ *                                                             report the error occurred during logical replication.
+ * ----------
+ */
+#define PGSTAT_SUBWORKERERROR_MSGLEN 256
+typedef struct PgStat_MsgSubWorkerError
+{

Regards,
Vignesh

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

17 November 2021, 14:14:11

On Tue, Nov 16, 2021 at 12:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Right. I've fixed this issue and attached an updated patch.
>

Few comments/questions:
=====================
1.
+  <para>
+   The <structname>pg_stat_subscription_workers</structname> view will contain
+   one row per subscription error reported by workers applying logical
+   replication changes and workers handling the initial data copy of the
+   subscribed tables.  The statistics entry is removed when the subscription
+   the worker is running on is removed.
+  </para>

The last line of this paragraph is not clear to me. First "the" before
"worker" in the following part of the sentence seems unnecessary
"..when the subscription the worker..". Then the part "running on is
removed" is unclear because it could also mean that we remove the
entry when a subscription is disabled. Can we rephrase it to: "The
statistics entry is removed when the corresponding subscription is
dropped"?

2.
Between the v20 and v23 versions of the patch, the size of the hash table
PGSTAT_SUBWORKER_HASH_SIZE was increased from 32 to 256. I might have
missed the comment which led to this change; can you point me to it,
or, if you changed it for some other reason, can you let me know
the reason?

3.
+
+ /*
+ * Repeat for subscription workers.  Similarly, we needn't bother
+ * in the common case where no function stats are being collected.
+ */

s/function/subscription workers/

4.
+      <para>
+       Name of command being applied when the error occurred.  This field
+       is always NULL if the error was reported during the initial data
+       copy.
+      </para></entry>
+     </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>xid</structfield> <type>xid</type>
+      </para>
+      <para>
+       Transaction ID of the publisher node being applied when the error
+       occurred.  This field is always NULL if the error was reported
+       during the initial data copy.
+      </para></entry>

Is it important to stress 'always' in the above two descriptions?

5.
The current description of first/last_error_time seems slightly
misleading, as one can interpret these as being about different errors.
Let's change the description of first/last_error_time as
follows, or something along those lines:

</para>
+      <para>
+       Time at which the first error occurred
+      </para></entry>
+     </row>

First time at which this error occurred

<structfield>last_error_time</structfield> <type>timestamp with time zone</type>
+      </para>
+      <para>
+       Time at which the last error occurred

Last time at which this error occurred. This will be the same as
first_error_time except when the same error occurred more than once
consecutively.

6.
+        </indexterm>
+        <function>pg_stat_reset_subscription_worker</function> (
<parameter>subid</parameter> <type>oid</type>, <optional>
<parameter>relid</parameter> <type>oid</type> </optional> )
+        <returnvalue>void</returnvalue>
+       </para>
+       <para>
+        Resets the statistics of a single subscription worker running on the
+        subscription with <parameter>subid</parameter> shown in the
+        <structname>pg_stat_subscription_worker</structname> view.  If the
+        argument <parameter>relid</parameter> is not <literal>NULL</literal>,
+        resets statistics of the subscription worker handling the initial data
+        copy of the relation with <parameter>relid</parameter>.  Otherwise,
+        resets the subscription worker statistics of the main apply worker.
+        If the argument <parameter>relid</parameter> is omitted, resets the
+        statistics of all subscription workers running on the subscription
+        with <parameter>subid</parameter>.
+       </para>

The first line of this description seems to indicate that we can only
reset the stats of a single worker but the later part indicates that
we can reset stats of all subscription workers. Can we change the
first line as: "Resets the statistics of subscription workers running
on the subscription with <parameter>subid</parameter> shown in the
<structname>pg_stat_subscription_worker</structname> view.".

7.
pgstat_vacuum_stat()
{
..
+ pgstat_setheader(&spmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
+ spmsg.m_databaseid = MyDatabaseId;
+ spmsg.m_nentries = 0;
..
}

Do we really need to set the header here? It seems to be getting set
in pgstat_send_subscription_purge() while sending this message.

8.
pgstat_vacuum_stat()
{
..
+
+ if (hash_search(htab, (void *) &(subwentry->key.subid), HASH_FIND, NULL)
+ != NULL)
+ continue;
+
+ /* This subscription is dead, add the subid to the message */
+ spmsg.m_subids[spmsg.m_nentries++] = subwentry->key.subid;
..
}

I think it is better to use a separate variable here for subid as we
are using for funcid and tableid. That will make this part of the code
easier to follow and look consistent.

9.
+/* ----------
+ * PgStat_MsgSubWorkerError Sent by the apply worker or the table sync worker to
+ * report the error occurred during logical replication.
+ * ----------

In this comment "during logical replication" sounds too generic. Can
we instead use "while processing changes." or something like that to
make it a bit more specific?
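
Comment 8 can be illustrated with a small standalone C sketch. It is
not the patch's code: plain arrays stand in for the hash tables, and
subid_is_live() and collect_dead_subids() are hypothetical helpers. The
only point is that reading the key into a local subid variable once makes
both the lookup and the message fill-in easier to follow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup: is this subscription OID still present? */
bool
subid_is_live(const uint32_t *live, size_t nlive, uint32_t subid)
{
    for (size_t i = 0; i < nlive; i++)
    {
        if (live[i] == subid)
            return true;
    }
    return false;
}

/*
 * Collect the subscription OIDs of stats entries whose subscription no
 * longer exists.  Returns the number of OIDs written to "dead".  This
 * sketch stops when the buffer is full and does not filter duplicates.
 */
size_t
collect_dead_subids(const uint32_t *entry_subids, size_t nentries,
                    const uint32_t *live, size_t nlive,
                    uint32_t *dead, size_t dead_cap)
{
    size_t      ndead = 0;

    assert(entry_subids != NULL && live != NULL && dead != NULL);

    for (size_t i = 0; i < nentries && ndead < dead_cap; i++)
    {
        /* Separate variable, mirroring how funcid and tableid are handled. */
        uint32_t    subid = entry_subids[i];

        if (subid_is_live(live, nlive, subid))
            continue;

        /* This subscription is dead, add the subid to the output. */
        dead[ndead++] = subid;
    }

    return ndead;
}
```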

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

17 November 2021, 14:17:18

On Wed, Nov 17, 2021 at 4:16 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Few comments:
> 1) should we set subwentry to NULL to handle !create && !found case
> or we could return NULL similar to the earlier function.
>

I think it is good to be consistent with the nearby code in this case.

-- 
With Regards,
Amit Kapila.

RE: Skipping logical replication transactions on subscriber side

From

"houzj.fnst@fujitsu.com"

Date:

18 November 2021, 06:52:23

On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> Right. I've fixed this issue and attached an updated patch.
> 
Hi,

I have a few comments on the test cases.

1)

+my $appname = 'tap_sub';
+$node_subscriber->safe_psql(
+    'postgres',
+    "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, two_phase = on);");
 
+my $appname_streaming = 'tap_sub_streaming';
+$node_subscriber->safe_psql(
+    'postgres',
+    "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
 
+

I think we can remove the 'application_name=$appname', so that the command
could be shorter. 

2)
+...(streaming = on, two_phase = on);");
Besides, is there some reason to set two_phase to on? If so,
it might be better to add a comment about it.


3)
+CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
+]);
+

It seems there are no tests that use the table test_tab_streaming. I guess this
table is meant to test streaming change errors; maybe we can add some tests for
it?

Best regards,
Hou zj

RE: Skipping logical replication transactions on subscriber side

From

"tanghy.fnst@fujitsu.com"

Date:

18 November 2021, 11:45:29

On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> 
> Right. I've fixed this issue and attached an updated patch.
> 
>

Thanks for your patch.

I read the discussion about stats entries for the table sync worker [1]: the
statistics are retained after the table sync worker finishes its job, and the user can remove
them via the pg_stat_reset_subscription_worker function.

But I noticed that, once a table sync worker has finished its job, the error reported by
this worker is no longer shown in the pg_stat_subscription_workers view. (It seems to be caused by this condition: "WHERE
srsubstate <> 'r'".) Is it intentional? I think this could mean that users don't know the statistics still
exist, and so won't remove them manually. And that is not friendly to users' storage, right?

[1] https://www.postgresql.org/message-id/CAD21AoAT42mhcqeB1jPfRL1%2BEUHbZk8MMY_fBgsyZvJeKNpG%2Bw%40mail.gmail.com

Regards
Tang

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

18 November 2021, 14:39:44

On Thu, Nov 18, 2021 at 5:45 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Right. I've fixed this issue and attached an updated patch.
> >
> >
>
> Thanks for your patch.
>
> I read the discussion about stats entries for table sync worker[1], the
> statistics are retained after table sync worker finished its jobs and user can remove
> them via pg_stat_reset_subscription_worker function.
>
> But I notice that, if a table sync worker finished its jobs, the error reported by
> this worker will not be shown in the pg_stat_subscription_workers view. (It seemed caused by this condition: "WHERE
> srsubstate <> 'r'") Is it intentional? I think this may cause a result that users don't know the statistics are still
> exist, and won't remove the statistics manually. And that is not friendly to users' storage, right?
>

You're right. The condition "WHERE srsubstate <> 'r'" should be removed.
I'll make that change in the next version of the patch. Thanks!

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

18 November 2021, 16:59:28

On Wed, Nov 17, 2021 at 12:43 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Tues, Nov 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > Right. I've fixed this issue and attached an updated patch.
>
> Hi,
>
> Thanks for updating the patch.
> Here are few comments.

Thank you for the comments!

>
> 1)
>
> +        <function>pg_stat_reset_subscription_worker</function> ( <parameter>subid</parameter> <type>oid</type>,
> <optional><parameter>relid</parameter> <type>oid</type> </optional> )
>
> It seems we should put '<optional>' before the comma(',').

Will fix.

>
>
> 2)
> +     <row>
> +      <entry role="catalog_table_entry"><para role="column_definition">
> +       <structfield>subrelid</structfield> <type>oid</type>
> +      </para>
> +      <para>
> +       OID of the relation that the worker is synchronizing; null for the
> +       main apply worker
> +      </para></entry>
> +     </row>
>
> Is the 'subrelid' only used for distinguishing the worker type ? If so, would it
> be clear to have a string value here. I recalled the previous version patch has
> failure_source column but was removed. Maybe I missed something.

As Amit mentioned, users can use this to check which table sync worker the entry belongs to.

>
>
> 3)
> .
> +extern void pgstat_reset_subworker_stats(Oid subid, Oid subrelid, bool allstats);
>
> I didn't find the code of this functions, maybe we can remove this declaration ?

Will remove.

I'll submit an updated patch.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

18 November 2021, 16:59:56

On Wed, Nov 17, 2021 at 7:46 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, Nov 16, 2021 at 12:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Nov 15, 2021 at 11:43 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Mon, Nov 15, 2021 at 2:48 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Mon, Nov 15, 2021 at 4:49 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> > > > >
> > > > > On Mon, Nov 15, 2021 at 1:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> >
> > Right. I've fixed this issue and attached an updated patch.
>
> Few comments:

Thank you for the comments!

> 1) should we set subwentry to NULL to handle !create && !found case
> or we could return NULL similar to the earlier function.
> +static PgStat_StatSubWorkerEntry *
> +pgstat_get_subworker_entry(PgStat_StatDBEntry *dbentry, Oid subid,
> Oid subrelid,
> +                                                  bool create)
> +{
> +       PgStat_StatSubWorkerEntry *subwentry;
> +       PgStat_StatSubWorkerKey key;
> +       bool            found;
> +       HASHACTION      action = (create ? HASH_ENTER : HASH_FIND);
> +
> +       key.subid = subid;
> +       key.subrelid = subrelid;
> +       subwentry = (PgStat_StatSubWorkerEntry *)
> hash_search(dbentry->subworkers,
> +
>                                            (void *) &key,
> +
>                                            action, &found);
> +
> +       /* If not found, initialize the new one */
> +       if (create && !found)

It's better to return NULL if !create && !found. Will fix.

>
> 2) Should we keep the line width to 80 chars:
> +/* ----------
> + * PgStat_MsgSubWorkerError            Sent by the apply worker or
> the table sync worker to
> + *                                                             report
> the error occurred during logical replication.
> + * ----------
> + */
> +#define PGSTAT_SUBWORKERERROR_MSGLEN 256
> +typedef struct PgStat_MsgSubWorkerError
> +{

Hmm, pgindent doesn't seem to fix it. Anyway, will fix.

I'll submit an updated patch.
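
The behavior agreed above for pgstat_get_subworker_entry() (return NULL
when the entry is not found and creation was not requested) can be
sketched as follows. This is a simplified illustration that uses a small
fixed-size array instead of PostgreSQL's dynahash; WorkerTable,
WorkerEntry, get_worker_entry() and MAX_ENTRIES are hypothetical names,
not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ENTRIES 8           /* hypothetical capacity of the sketch table */

typedef struct WorkerKey
{
    uint32_t    subid;
    uint32_t    subrelid;       /* 0 for the main apply worker */
} WorkerKey;

typedef struct WorkerEntry
{
    bool        in_use;
    WorkerKey   key;
    int64_t     error_count;
} WorkerEntry;

typedef struct WorkerTable
{
    WorkerEntry slots[MAX_ENTRIES];
} WorkerTable;

/*
 * Look up the entry for (subid, subrelid).  If it does not exist, create
 * and zero-initialize it when "create" is true; otherwise return NULL.
 * Unlike a real hash table, this sketch also returns NULL when full.
 */
WorkerEntry *
get_worker_entry(WorkerTable *table, uint32_t subid, uint32_t subrelid,
                 bool create)
{
    WorkerEntry *free_slot = NULL;

    assert(table != NULL);

    for (size_t i = 0; i < MAX_ENTRIES; i++)
    {
        WorkerEntry *e = &table->slots[i];

        if (e->in_use && e->key.subid == subid && e->key.subrelid == subrelid)
            return e;           /* found an existing entry */
        if (!e->in_use && free_slot == NULL)
            free_slot = e;      /* remember the first free slot */
    }

    /* Not found: callers that did not ask for creation get NULL. */
    if (!create || free_slot == NULL)
        return NULL;

    /* Not found and creation requested: initialize the new entry. */
    memset(free_slot, 0, sizeof(*free_slot));
    free_slot->in_use = true;
    free_slot->key.subid = subid;
    free_slot->key.subrelid = subrelid;
    return free_slot;
}
```

Callers that pass create = false then need an explicit NULL check before
touching the entry.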

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Greg Nancarrow

Date:

19 November 2021, 05:07:05

On Tue, Nov 16, 2021 at 5:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Right. I've fixed this issue and attached an updated patch.
>

A couple of comments for the v23 patch:

doc/src/sgml/monitoring.sgml
(1) inconsistent description
I think that the following description seems inconsistent with the
previous description given above it in the patch (i.e. "One row per
subscription worker, showing statistics about errors that occurred on
that subscription worker"):

"The <structname>pg_stat_subscription_workers</structname> view will
contain one row per subscription error reported by workers applying
logical replication changes and workers handling the initial data copy
of the subscribed tables."

I think it is inconsistent because it implies there could be multiple
subscription error rows for the same worker.
Maybe the following wording could be used instead, or something similar:

"The <structname>pg_stat_subscription_workers</structname> view will
contain one row per subscription worker on which errors have occurred,
for workers applying logical replication changes and workers handling
the initial data copy of the subscribed tables."

(2) null vs NULL
The "subrelid" column description uses "null" but the "command" column
description uses "NULL".
I think "NULL" should be used for consistency.

Regards,
Greg Nancarrow
Fujitsu Australia

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

19 November 2021, 06:51:52

On Thu, Nov 18, 2021 at 5:10 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Nov 18, 2021 at 5:45 PM tanghy.fnst@fujitsu.com
> <tanghy.fnst@fujitsu.com> wrote:
> >
> > On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > Right. I've fixed this issue and attached an updated patch.
> > >
> > >
> >
> > Thanks for your patch.
> >
> > I read the discussion about stats entries for table sync worker[1], the
> > statistics are retained after table sync worker finished its jobs and user can remove
> > them via pg_stat_reset_subscription_worker function.
> >
> > But I notice that, if a table sync worker finished its jobs, the error reported by
> > this worker will not be shown in the pg_stat_subscription_workers view. (It seemed caused by this condition: "WHERE
> > > srsubstate <> 'r'") Is it intentional? I think this may cause a result that users don't know the statistics are still
> > > exist, and won't remove the statistics manually. And that is not friendly to users' storage, right?
> >
>
> You're right. The condition "WHERE substate <> 'r') should be removed.
> I'll do that change in the next version patch. Thanks!
>

One more thing you might want to consider for the next version is
whether to rename the columns as discussed in the related thread [1]?
I think we should consider future work and name them accordingly.

[1] - https://www.postgresql.org/message-id/CAA4eK1KR41bRUuPeNBSGv2%2Bq7ROKukS3myeAUqrZMD8MEwR0DQ%40mail.gmail.com

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

vignesh C

Date:

19 November 2021, 08:38:56

On Fri, Nov 19, 2021 at 9:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Nov 18, 2021 at 5:10 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Nov 18, 2021 at 5:45 PM tanghy.fnst@fujitsu.com
> > <tanghy.fnst@fujitsu.com> wrote:
> > >
> > > On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > Right. I've fixed this issue and attached an updated patch.
> > > >
> > > >
> > >
> > > Thanks for your patch.
> > >
> > > I read the discussion about stats entries for table sync worker[1], the
> > > statistics are retained after table sync worker finished its jobs and user can remove
> > > them via pg_stat_reset_subscription_worker function.
> > >
> > > But I notice that, if a table sync worker finished its jobs, the error reported by
> > > this worker will not be shown in the pg_stat_subscription_workers view. (It seemed caused by this condition:
> > > > "WHERE srsubstate <> 'r'") Is it intentional? I think this may cause a result that users don't know the statistics are
> > > > still exist, and won't remove the statistics manually. And that is not friendly to users' storage, right?
> > >
> >
> > You're right. The condition "WHERE substate <> 'r') should be removed.
> > I'll do that change in the next version patch. Thanks!
> >
>
> One more thing you might want to consider for the next version is
> whether to rename the columns as discussed in the related thread [1]?
> I think we should consider future work and name them accordingly.
>
> [1] - https://www.postgresql.org/message-id/CAA4eK1KR41bRUuPeNBSGv2%2Bq7ROKukS3myeAUqrZMD8MEwR0DQ%40mail.gmail.com

Since the statistics collector process uses a UDP socket, the ordering
of messages is not guaranteed. Will there be a problem if a
subscription is dropped, the stats collector receives
PGSTAT_MTYPE_SUBSCRIPTIONPURGE first and removes the subscription
worker entry, and then receives PGSTAT_MTYPE_SUBWORKERERROR (this order
can happen because of the UDP socket)? I'm not sure if the Assert will be
a problem in this case. If this scenario is possible, we could just
silently return in that case.

+static void
+pgstat_recv_subworker_error(PgStat_MsgSubWorkerError *msg, int len)
+{
+       PgStat_StatDBEntry *dbentry;
+       PgStat_StatSubWorkerEntry *subwentry;
+
+       dbentry = pgstat_get_db_entry(msg->m_databaseid, true);
+
+       /* Get the subscription worker stats */
+       subwentry = pgstat_get_subworker_entry(dbentry, msg->m_subid,
+                                              msg->m_subrelid, true);
+       Assert(subwentry);
+
+       /*
+        * Update only the counter and last error timestamp if we received
+        * the same error again
+        */

Thoughts?

Regards,
Vignesh

Re: Skipping logical replication transactions on subscriber side

From

Greg Nancarrow

Date:

19 November 2021, 09:32:23

On Fri, Nov 19, 2021 at 4:39 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Since the statistics collector process uses UDP socket, the sequencing
> of the messages is not guaranteed. Will there be a problem if
> Subscription is dropped and stats collector receives
> PGSTAT_MTYPE_SUBSCRIPTIONPURGE first and the subscription worker entry
> is removed and then receives PGSTAT_MTYPE_SUBWORKERERROR(this order
> can happen because of UDP socket). I'm not sure if the Assert will be
> a problem in this case. If this scenario is possible we could just
> silently return in that case.
>

Given that the message sequencing is not guaranteed, it looks like
that Assert and the current code after it won't handle that scenario
well. Silently returning if subwentry is NULL does seem like the way
to deal with that possibility.
Doesn't this possibility of out-of-sequence messaging due to UDP
similarly mean that "first_error_time" and "last_error_time" may not
be currently handled correctly?

Regards,
Greg Nancarrow
Fujitsu Australia

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

19 November 2021, 09:52:37

On Fri, Nov 19, 2021 at 11:09 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, Nov 19, 2021 at 9:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Nov 18, 2021 at 5:10 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Nov 18, 2021 at 5:45 PM tanghy.fnst@fujitsu.com
> > > <tanghy.fnst@fujitsu.com> wrote:
> > > >
> > > > On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > Right. I've fixed this issue and attached an updated patch.
> > > > >
> > > > >
> > > >
> > > > Thanks for your patch.
> > > >
> > > > I read the discussion about stats entries for table sync worker[1], the
> > > > statistics are retained after table sync worker finished its jobs and user can remove
> > > > them via pg_stat_reset_subscription_worker function.
> > > >
> > > > But I notice that, if a table sync worker finished its jobs, the error reported by
> > > > this worker will not be shown in the pg_stat_subscription_workers view. (It seemed caused by this condition:
> > > > > "WHERE srsubstate <> 'r'") Is it intentional? I think this may cause a result that users don't know the statistics are
> > > > > still exist, and won't remove the statistics manually. And that is not friendly to users' storage, right?
> > > >
> > >
> > > You're right. The condition "WHERE substate <> 'r') should be removed.
> > > I'll do that change in the next version patch. Thanks!
> > >
> >
> > One more thing you might want to consider for the next version is
> > whether to rename the columns as discussed in the related thread [1]?
> > I think we should consider future work and name them accordingly.
> >
> > [1] - https://www.postgresql.org/message-id/CAA4eK1KR41bRUuPeNBSGv2%2Bq7ROKukS3myeAUqrZMD8MEwR0DQ%40mail.gmail.com
>
> Since the statistics collector process uses UDP socket, the sequencing
> of the messages is not guaranteed. Will there be a problem if
> Subscription is dropped and stats collector receives
> PGSTAT_MTYPE_SUBSCRIPTIONPURGE first and the subscription worker entry
> is removed and then receives PGSTAT_MTYPE_SUBWORKERERROR(this order
> can happen because of UDP socket). I'm not sure if the Assert will be
> a problem in this case.
>

Why would that Assert be hit? We seem to always pass 'create' as
true, so it should create a new entry. I think a similar situation can
happen for functions, and it will probably be cleaned up in the next
vacuum cycle.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

vignesh C

Date:

19 November 2021, 10:08:06

On Fri, Nov 19, 2021 at 12:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Nov 19, 2021 at 11:09 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Fri, Nov 19, 2021 at 9:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Nov 18, 2021 at 5:10 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Thu, Nov 18, 2021 at 5:45 PM tanghy.fnst@fujitsu.com
> > > > <tanghy.fnst@fujitsu.com> wrote:
> > > > >
> > > > > On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > > Right. I've fixed this issue and attached an updated patch.
> > > > > >
> > > > > >
> > > > >
> > > > > Thanks for your patch.
> > > > >
> > > > > I read the discussion about stats entries for table sync worker[1], the
> > > > > statistics are retained after table sync worker finished its jobs and user can remove
> > > > > them via pg_stat_reset_subscription_worker function.
> > > > >
> > > > > But I notice that, if a table sync worker finished its jobs, the error reported by
> > > > > this worker will not be shown in the pg_stat_subscription_workers view. (It seemed caused by this condition:
> > > > > > "WHERE srsubstate <> 'r'") Is it intentional? I think this may cause a result that users don't know the statistics are
> > > > > > still exist, and won't remove the statistics manually. And that is not friendly to users' storage, right?
> > > > >
> > > >
> > > > You're right. The condition "WHERE substate <> 'r') should be removed.
> > > > I'll do that change in the next version patch. Thanks!
> > > >
> > >
> > > One more thing you might want to consider for the next version is
> > > whether to rename the columns as discussed in the related thread [1]?
> > > I think we should consider future work and name them accordingly.
> > >
> > > [1] -
https://www.postgresql.org/message-id/CAA4eK1KR41bRUuPeNBSGv2%2Bq7ROKukS3myeAUqrZMD8MEwR0DQ%40mail.gmail.com
> >
> > Since the statistics collector process uses UDP socket, the sequencing
> > of the messages is not guaranteed. Will there be a problem if
> > Subscription is dropped and stats collector receives
> > PGSTAT_MTYPE_SUBSCRIPTIONPURGE first and the subscription worker entry
> > is removed and then receives PGSTAT_MTYPE_SUBWORKERERROR(this order
> > can happen because of UDP socket). I'm not sure if the Assert will be
> > a problem in this case.
> >
>
> Why that Assert will hit? We seem to be always passing 'create' as
> true so it should create a new entry. I think a similar situation can
> happen for functions and it will be probably cleaned in the next
> vacuum cycle.

Since we are passing true, that Assert will not be hit; sorry, I failed to
notice that. It will create a new entry, as you rightly pointed out.
Since the cleanup is handled by vacuum and the current code already works
that way, I see no need to make any change.

Regards,
Vignesh

Re: Skipping logical replication transactions on subscriber side

From

Greg Nancarrow

Date:

19 November 2021, 10:51:54

On Fri, Nov 19, 2021 at 5:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> Why that Assert will hit? We seem to be always passing 'create' as
> true so it should create a new entry. I think a similar situation can
> happen for functions and it will be probably cleaned in the next
> vacuum cycle.
>
Oops, I missed that too. So at worst, vacuum will clean it up in the
out-of-order SUBSCRIPTIONPURGE,SUBWORKERERROR case.

But I still think the current code may not correctly handle
first_error_time/last_error_time timestamps if out-of-order
SUBWORKERERROR messages occur, right?

Regards,
Greg Nancarrow
Fujitsu Australia

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

19 November 2021, 12:14:55

On Fri, Nov 19, 2021 at 1:22 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Fri, Nov 19, 2021 at 5:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Why that Assert will hit? We seem to be always passing 'create' as
> > true so it should create a new entry. I think a similar situation can
> > happen for functions and it will be probably cleaned in the next
> > vacuum cycle.
> >
> Oops, I missed that too. So at worst, vacuum will clean it up in the
> out-of-order SUBSCRIPTIONPURGE,SUBWORKERERROR case.
>
> But I still think the current code may not correctly handle
> first_error_time/last_error_time timestamps if out-of-order
> SUBWORKERERROR messages occur, right?
>

Yeah in such a case last_error_time can be shown as a time before
first_error_time but I don't think that will be a big problem, the
next message will fix it. I don't see what we can do about it and the
same is true for other cases like pg_stat_archiver where the success
and failure times can be out of order. If we want we can remove one of
those times but I don't think this happens frequently enough to be
considered a problem. Anyway, these stats are not considered to be
updated with the latest info.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Greg Nancarrow

Date:

19 November 2021, 12:30:42

On Fri, Nov 19, 2021 at 8:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> Yeah in such a case last_error_time can be shown as a time before
> first_error_time but I don't think that will be a big problem, the
> next message will fix it. I don't see what we can do about it and the
> same is true for other cases like pg_stat_archiver where the success
> and failure times can be out of order. If we want we can remove one of
> those times but I don't think this happens frequently enough to be
> considered a problem. Anyway, these stats are not considered to be
> updated with the most latest info.
>

Couldn't the code block in pgstat_recv_subworker_error() that
increments error_count just compare the new msg timestamp against the
existing first_error_time and last_error_time and, based on the
result, update those if required?
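
To make the suggestion concrete, here is a minimal sketch in C of such a
comparison. The struct SubWorkerErrorStats and the function
record_repeated_error are hypothetical stand-ins, not the patch's actual
PgStat_StatSubWorkerEntry or pgstat_recv_subworker_error(), and
timestamps are modeled as plain 64-bit integers:

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the collector's per-worker entry. */
typedef struct SubWorkerErrorStats
{
    int64_t error_count;      /* number of reports folded into this entry */
    int64_t first_error_time; /* earliest report timestamp seen so far */
    int64_t last_error_time;  /* latest report timestamp seen so far */
} SubWorkerErrorStats;

/*
 * Fold one more report of the same error into the entry.  Reports may
 * arrive out of order, so each timestamp only moves in one direction:
 * first_error_time can only go backwards, last_error_time only forwards.
 */
void
record_repeated_error(SubWorkerErrorStats *entry, int64_t msg_time)
{
    if (entry->error_count == 0)
    {
        /* First report: both timestamps start at the same value. */
        entry->first_error_time = msg_time;
        entry->last_error_time = msg_time;
    }
    else
    {
        if (msg_time < entry->first_error_time)
            entry->first_error_time = msg_time;
        if (msg_time > entry->last_error_time)
            entry->last_error_time = msg_time;
    }

    entry->error_count++;
}
```

With this shape, a late-arriving older report can never make
last_error_time appear earlier than first_error_time.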

Regards,
Greg Nancarrow
Fujitsu Australia

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

19 November 2021, 13:09:32

On Fri, Nov 19, 2021 at 3:00 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Fri, Nov 19, 2021 at 8:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Yeah in such a case last_error_time can be shown as a time before
> > first_error_time but I don't think that will be a big problem, the
> > next message will fix it. I don't see what we can do about it and the
> > same is true for other cases like pg_stat_archiver where the success
> > and failure times can be out of order. If we want we can remove one of
> > those times but I don't think this happens frequently enough to be
> > considered a problem. Anyway, these stats are not considered to be
> > updated with the most latest info.
> >
>
> Couldn't the code block in pgstat_recv_subworker_error() that
> increments error_count just compare the new msg timestamp against the
> existing first_error_time and last_error_time and, based on the
> result, update those if required?
>

I don't see any problem with that but let's see what Sawada-San has to
say about this?

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

19 November 2021, 16:19:02

On Fri, Nov 19, 2021 at 7:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Nov 19, 2021 at 3:00 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >
> > On Fri, Nov 19, 2021 at 8:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > Yeah in such a case last_error_time can be shown as a time before
> > > first_error_time but I don't think that will be a big problem, the
> > > next message will fix it. I don't see what we can do about it and the
> > > same is true for other cases like pg_stat_archiver where the success
> > > and failure times can be out of order. If we want we can remove one of
> > > those times but I don't think this happens frequently enough to be
> > > considered a problem. Anyway, these stats are not considered to be
> > > updated with the most latest info.
> > >
> >
> > Couldn't the code block in pgstat_recv_subworker_error() that
> > increments error_count just compare the new msg timestamp against the
> > existing first_error_time and last_error_time and, based on the
> > result, update those if required?
> >
>
> I don't see any problem with that but let's see what Sawada-San has to
> say about this?

IMO, I'm not sure we should do that. Since the stats collector is
unlikely to receive the same error report frequently in practice (the
interval is 5 sec by default), this problem is unlikely to happen.
Even if the same messages are reported frequently enough to cause this
problem, the next message will also be reported soon, fixing it, as
Amit mentioned. Also, IIUC once we have the shared memory based
stats collector, we won't need to worry about this problem. Given that
this kind of problem potentially also exists in other stats views that
have timestamp values, I'm not sure it's worth dealing with this
problem only in the pg_stat_subscription_workers view.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

24 November 2021, 05:20:23

On Thu, Nov 18, 2021 at 12:52 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > Right. I've fixed this issue and attached an updated patch.
> >
> Hi,
>
> I have few comments for the testcases.
>
> 1)
>
> +my $appname = 'tap_sub';
> +$node_subscriber->safe_psql(
> +    'postgres',
> +    "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, two_phase = on);");
>
> +my $appname_streaming = 'tap_sub_streaming';
> +$node_subscriber->safe_psql(
> +    'postgres',
> +    "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
>
> +
>
> I think we can remove the 'application_name=$appname', so that the command
> could be shorter.

But we wait for the subscription to catch up by using
wait_for_catchup() with application_name, no?

>
> 2)
> +...(streaming = on, two_phase = on);");
> Besides, is there some reason to set two_phase to on? If so,
> it might be better to add some comments about it.
>

Yes, two_phase = on is required by the tests for the skip transaction
patch. Will remove it.

>
> 3)
> +CREATE PUBLICATION tap_pub_streaming FOR TABLE test_tab_streaming;
> +]);
> +
>
> It seems there's no tests to use the table test_tab_streaming. I guess this
> table is used to test streaming change error, maybe we can add some tests for
> it ?

Oops, similarly this is also required by the skip transaction tests.
Will remove it.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

24 November 2021, 06:13:17

On Wed, Nov 24, 2021 at 7:51 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Nov 18, 2021 at 12:52 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > Right. I've fixed this issue and attached an updated patch.
> > >
> > Hi,
> >
> > I have few comments for the testcases.
> >
> > 1)
> >
> > +my $appname = 'tap_sub';
> > +$node_subscriber->safe_psql(
> > +    'postgres',
> > +    "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, two_phase = on);");
> >
> > +my $appname_streaming = 'tap_sub_streaming';
> > +$node_subscriber->safe_psql(
> > +    'postgres',
> > +    "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
> >
> > +
> >
> > I think we can remove the 'application_name=$appname', so that the command
> > could be shorter.
>
> But we wait for the subscription to catch up by using
> wait_for_catchup() with application_name, no?
>

Yeah, but you can directly use the subscription name in
wait_for_catchup because we internally use that as
fallback_application_name. If application_name is not specified in the
connection string as suggested by Hou-San then
fallback_application_name will be considered. Both ways are okay and I
see we use both ways in the tests but it seems there are more places
where we use the method Hou-San is suggesting in subscription tests.
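
A simplified model of that choice, as a C sketch: the function name
effective_application_name is made up for illustration, and the sketch
ignores other sources such as the PGAPPNAME environment variable. It
only shows that an explicitly given application_name wins and that the
fallback (the subscription name, as described above) is used when none
is given:

```c
#include <stddef.h>

/*
 * Hypothetical helper modeling which name the publisher ends up seeing.
 * application_name is NULL when the connection string does not set it.
 */
const char *
effective_application_name(const char *application_name,
                           const char *fallback_application_name)
{
    if (application_name != NULL)
        return application_name;

    /* Nothing was specified explicitly, so the fallback applies. */
    return fallback_application_name;
}
```

So a test that omits application_name from the connection string can
still wait on the subscription name itself.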

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

24 November 2021, 11:19:40

On Wed, Nov 24, 2021 at 12:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Nov 24, 2021 at 7:51 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Nov 18, 2021 at 12:52 PM houzj.fnst@fujitsu.com
> > <houzj.fnst@fujitsu.com> wrote:
> > >
> > > On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > Right. I've fixed this issue and attached an updated patch.
> > > >
> > > Hi,
> > >
> > > I have few comments for the testcases.
> > >
> > > 1)
> > >
> > > +my $appname = 'tap_sub';
> > > +$node_subscriber->safe_psql(
> > > +    'postgres',
> > > +    "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, two_phase = on);");
> > >
> > > +my $appname_streaming = 'tap_sub_streaming';
> > > +$node_subscriber->safe_psql(
> > > +    'postgres',
> > > +    "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
> > >
> > > +
> > >
> > > I think we can remove the 'application_name=$appname', so that the command
> > > could be shorter.
> >
> > But we wait for the subscription to catch up by using
> > wait_for_catchup() with application_name, no?
> >
>
> Yeah, but you can directly use the subscription name in
> wait_for_catchup because we internally use that as
> fallback_application_name. If application_name is not specified in the
> connection string as suggested by Hou-San then
> fallback_application_name will be considered. Both ways are okay and I
> see we use both ways in the tests but it seems there are more places
> where we use the method Hou-San is suggesting in subscription tests.

Okay, thanks! I referred to tests that set application_name. ISTM it's
better to unify them to avoid confusion in future tests.

Anyway, I'll remove it in the next version patch that I'll submit soon.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

24 November 2021, 11:50:24

On Wed, Nov 24, 2021 at 1:50 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Nov 24, 2021 at 12:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Nov 24, 2021 at 7:51 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Nov 18, 2021 at 12:52 PM houzj.fnst@fujitsu.com
> > > <houzj.fnst@fujitsu.com> wrote:
> > > >
> > > > On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > Right. I've fixed this issue and attached an updated patch.
> > > > >
> > > > Hi,
> > > >
> > > > I have few comments for the testcases.
> > > >
> > > > 1)
> > > >
> > > > +my $appname = 'tap_sub';
> > > > +$node_subscriber->safe_psql(
> > > > +    'postgres',
> > > > +    "CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, two_phase = on);");
> > > >
> > > > +my $appname_streaming = 'tap_sub_streaming';
> > > > +$node_subscriber->safe_psql(
> > > > +    'postgres',
> > > > +    "CREATE SUBSCRIPTION tap_sub_streaming CONNECTION '$publisher_connstr application_name=$appname_streaming' PUBLICATION tap_pub_streaming WITH (streaming = on, two_phase = on);");
> > > >
> > > > +
> > > >
> > > > I think we can remove the 'application_name=$appname', so that the command
> > > > could be shorter.
> > >
> > > But we wait for the subscription to catch up by using
> > > wait_for_catchup() with application_name, no?
> > >
> >
> > Yeah, but you can directly use the subscription name in
> > wait_for_catchup because we internally use that as
> > fallback_application_name. If application_name is not specified in the
> > connection string as suggested by Hou-San then
> > fallback_application_name will be considered. Both ways are okay and I
> > see we use both ways in the tests but it seems there are more places
> > where we use the method Hou-San is suggesting in subscription tests.
>
> Okay, thanks! I referred to tests that set application_name. ISTM it's
> better to unite them so as not to confuse them in future tests.
>

Agreed, but let's do this clean-up as a separate patch. Feel free to
submit the patch for the same in a separate thread.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

24 November 2021, 14:43:35

On Wed, Nov 17, 2021 at 8:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Nov 16, 2021 at 12:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Right. I've fixed this issue and attached an updated patch.
> >
>
> Few comments/questions:
> =====================
> 1.
> +  <para>
> +   The <structname>pg_stat_subscription_workers</structname> view will contain
> +   one row per subscription error reported by workers applying logical
> +   replication changes and workers handling the initial data copy of the
> +   subscribed tables.  The statistics entry is removed when the subscription
> +   the worker is running on is removed.
> +  </para>
>
> The last line of this paragraph is not clear to me. First "the" before
> "worker" in the following part of the sentence seems unnecessary
> "..when the subscription the worker..". Then the part "running on is
> removed" is unclear because it could also mean that we remove the
> entry when a subscription is disabled. Can we rephrase it to: "The
> statistics entry is removed when the corresponding subscription is
> dropped"?

Agreed. Fixed.

>
> 2.
> Between v20 and v23 versions of patch the size of hash table
> PGSTAT_SUBWORKER_HASH_SIZE is increased from 32 to 256. I might have
> missed the comment which lead to this change, can you point me to the
> same or if you changed it for some other reason, can you let me know
> the same?

I'd missed reverting this change. I considered increasing this value
since the lifetime of a subscription is long. But given that an
unshared hash table can be expanded on the fly, it's better to start
with a small value. Reverted.

>
> 3.
> +
> + /*
> + * Repeat for subscription workers.  Similarly, we needn't bother
> + * in the common case where no function stats are being collected.
> + */
>
> s/function/subscription workers/

Fixed.

>
> 4.
> +      <para>
> +       Name of command being applied when the error occurred.  This field
> +       is always NULL if the error was reported during the initial data
> +       copy.
> +      </para></entry>
> +     </row>
> +
> +     <row>
> +      <entry role="catalog_table_entry"><para role="column_definition">
> +       <structfield>xid</structfield> <type>xid</type>
> +      </para>
> +      <para>
> +       Transaction ID of the publisher node being applied when the error
> +       occurred.  This field is always NULL if the error was reported
> +       during the initial data copy.
> +      </para></entry>
>
> Is it important to stress on 'always' in the above two descriptions?

No, removed.

>
> 5.
> The current description of first/last_error_time seems slightly
> misleading as one can interpret that these are about different errors.
> Let's slightly change the description of first/last_error_time as
> follows or something on those lines:
>
> </para>
> +      <para>
> +       Time at which the first error occurred
> +      </para></entry>
> +     </row>
>
> First time at which this error occurred
>
> <structfield>last_error_time</structfield> <type>timestamp with time zone</type>
> +      </para>
> +      <para>
> +       Time at which the last error occurred
>
> Last time at which this error occurred. This will be the same as
> first_error_time except when the same error occurred more than once
> consecutively.

Changed. I've removed first_error_time as per discussion on the thread
for adding xact stats.

>
> 6.
> +        </indexterm>
> +        <function>pg_stat_reset_subscription_worker</function> ( <parameter>subid</parameter> <type>oid</type>, <optional> <parameter>relid</parameter> <type>oid</type> </optional> )
> +        <returnvalue>void</returnvalue>
> +       </para>
> +       <para>
> +        Resets the statistics of a single subscription worker running on the
> +        subscription with <parameter>subid</parameter> shown in the
> +        <structname>pg_stat_subscription_worker</structname> view.  If the
> +        argument <parameter>relid</parameter> is not <literal>NULL</literal>,
> +        resets statistics of the subscription worker handling the initial data
> +        copy of the relation with <parameter>relid</parameter>.  Otherwise,
> +        resets the subscription worker statistics of the main apply worker.
> +        If the argument <parameter>relid</parameter> is omitted, resets the
> +        statistics of all subscription workers running on the subscription
> +        with <parameter>subid</parameter>.
> +       </para>
>
> The first line of this description seems to indicate that we can only
> reset the stats of a single worker but the later part indicates that
> we can reset stats of all subscription workers. Can we change the
> first line as: "Resets the statistics of subscription workers running
> on the subscription with <parameter>subid</parameter> shown in the
> <structname>pg_stat_subscription_worker</structname> view.".
>

Changed.

> 7.
> pgstat_vacuum_stat()
> {
> ..
> + pgstat_setheader(&spmsg.m_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
> + spmsg.m_databaseid = MyDatabaseId;
> + spmsg.m_nentries = 0;
> ..
> }
>
> Do we really need to set the header here? It seems to be getting set
> in pgstat_send_subscription_purge() while sending this message.

Removed.

>
> 8.
> pgstat_vacuum_stat()
> {
> ..
> +
> + if (hash_search(htab, (void *) &(subwentry->key.subid), HASH_FIND, NULL)
> + != NULL)
> + continue;
> +
> + /* This subscription is dead, add the subid to the message */
> + spmsg.m_subids[spmsg.m_nentries++] = subwentry->key.subid;
> ..
> }
>
> I think it is better to use a separate variable here for subid as we
> are using for funcid and tableid. That will make this part of the code
> easier to follow and look consistent.

Agreed, and changed.

>
> 9.
> +/* ----------
> + * PgStat_MsgSubWorkerError Sent by the apply worker or the table sync worker to
> + * report the error occurred during logical replication.
> + * ----------
>
> In this comment "during logical replication" sounds too generic. Can
> we instead use "while processing changes." or something like that to
> make it a bit more specific?

"while processing changes" sounds good.

I've attached an updated version of the patch. Unless I've missed
something, all comments I got so far have been incorporated into this
patch. Please review it.


Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

v24-0001-Add-a-subscription-worker-statistics-view-pg_sta.patch

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

25 November 2021, 07:57:09

On Wed, Nov 24, 2021 at 5:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Changed. I've removed first_error_time as per discussion on the thread
> for adding xact stats.
>

We also agreed to change the column names to start with last_error_*
[1]. Is there a reason to not make those changes? Do you think that we
can change it just before committing that patch? I thought it might be
better to do it that way now itself.

[1] - https://www.postgresql.org/message-id/CAD21AoCQ8z5goy3BCqfk2gn5p8NVH5B-uxO3Xc-dXN-MXVfnKg%40mail.gmail.com

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

vignesh C

Date:

25 November 2021, 13:36:10

On Wed, Nov 24, 2021 at 5:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Nov 17, 2021 at 8:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Nov 16, 2021 at 12:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > Right. I've fixed this issue and attached an updated patch.

One very minor comment:
"conflict" can be moved to the next line to keep the comment within the
80-character boundary wherever possible:
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql(

Similarly in the below:
+# Insert more data to test_tab1, raising an error on the subscriber due to violation
+# of the unique constraint on test_tab1.
+my $xid = $node_publisher->safe_psql(

The rest of the patch looks good.

Regards,
Vignesh

Re: Skipping logical replication transactions on subscriber side

From

Greg Nancarrow

Date:

25 November 2021, 15:08:14

On Wed, Nov 24, 2021 at 10:44 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached an updated version patch. Unless I miss something, all
> comments I got so far have been incorporated into this patch. Please
> review it.
>

Only a couple of minor points:

src/backend/postmaster/pgstat.c
(1) pgstat_get_subworker_entry

In the following comment, it should say "returns an entry ...":

+ * apply worker otherwise returns entry of the table sync worker associated

src/include/pgstat.h
(2) typedef struct PgStat_StatDBEntry

"subworker" should be "subworkers" in the following comment, to match
the struct member name:

* subworker is the hash table of PgStat_StatSubWorkerEntry which stores

Otherwise, the patch LGTM.

Regards,
Greg Nancarrow
Fujitsu Australia

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

25 November 2021, 15:29:12

On Thu, Nov 25, 2021 at 1:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Nov 24, 2021 at 5:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Changed. I've removed first_error_time as per discussion on the thread
> > for adding xact stats.
> >
>
> We also agreed to change the column names to start with last_error_*
> [1]. Is there a reason to not make those changes? Do you think that we
> can change it just before committing that patch? I thought it might be
> better to do it that way now itself.

Oh, I thought you meant that we would change the column names when
adding xact stats to the view. But these names also make sense even
without the xact stats. I've attached an updated patch. It also
incorporates the comments from Vignesh and Greg.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

v25-0001-Add-a-subscription-worker-statistics-view-pg_sta.patch

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

25 November 2021, 15:29:45

On Thu, Nov 25, 2021 at 7:36 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, Nov 24, 2021 at 5:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Nov 17, 2021 at 8:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Nov 16, 2021 at 12:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > Right. I've fixed this issue and attached an updated patch.
>
> One very minor comment:
> conflict can be moved to next line to keep it within 80 chars boundary
> wherever possible
> +# Initial table setup on both publisher and subscriber. On subscriber we create
> +# the same tables but with primary keys. Also, insert some data that will conflict
> +# with the data replicated from publisher later.
> +$node_publisher->safe_psql(
>
> Similarly in the below:
> +# Insert more data to test_tab1, raising an error on the subscriber due to violation
> +# of the unique constraint on test_tab1.
> +my $xid = $node_publisher->safe_psql(
>
> The rest of the patch looks good.

Thank you for the comments! These are incorporated into the v25 patch I
just submitted.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

25 November 2021, 15:30:11

On Thu, Nov 25, 2021 at 9:08 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Wed, Nov 24, 2021 at 10:44 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached an updated version patch. Unless I miss something, all
> > comments I got so far have been incorporated into this patch. Please
> > review it.
> >
>
> Only a couple of minor points:
>
> src/backend/postmaster/pgstat.c
> (1) pgstat_get_subworker_entry
>
> In the following comment, it should say "returns an entry ...":
>
> + * apply worker otherwise returns entry of the table sync worker associated
>
> src/include/pgstat.h
> (2) typedef struct PgStat_StatDBEntry
>
> "subworker" should be "subworkers" in the following comment, to match
> the struct member name:
>
> * subworker is the hash table of PgStat_StatSubWorkerEntry which stores
>
> Otherwise, the patch LGTM.

Thank you for the comments! These are incorporated into the v25 patch I
just submitted.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

RE: Skipping logical replication transactions on subscriber side

From

"houzj.fnst@fujitsu.com"

Date:

25 November 2021, 16:05:43

On Thur, Nov 25, 2021 8:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Thu, Nov 25, 2021 at 1:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Nov 24, 2021 at 5:14 PM Masahiko Sawada
> <sawada.mshk@gmail.com> wrote:
> > >
> > > Changed. I've removed first_error_time as per discussion on the
> > > thread for adding xact stats.
> > >
> >
> > We also agreed to change the column names to start with last_error_*
> > [1]. Is there a reason to not make those changes? Do you think that we
> > can change it just before committing that patch? I thought it might be
> > better to do it that way now itself.
> 
> Oh, I thought that you think that we change the column names when adding xact
> stats to the view. But these names also make sense even without the xact stats.
> I've attached an updated patch. It also incorporated comments from Vignesh
> and Greg.
> 
Hi,

I only noticed some minor things in the testcases

1)
+$node_publisher->append_conf('postgresql.conf',
+                 qq[
+logical_decoding_work_mem = 64kB
+]);

It seems we don't need to set logical_decoding_work_mem since we don't test streaming?

2)
+$node_publisher->safe_psql('postgres',
+               q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+]);

There are a few places where a 'q[' or 'qq[' block contains only one command, as in the code above.
To be consistent, I think it would be better to remove the wrapper here; maybe we can write it like this:
$node_publisher->safe_psql('postgres',
    ' CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;');

The others LGTM.

Best regards,
Hou zj

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

26 November 2021, 03:29:39

On Thu, Nov 25, 2021 at 10:06 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Thur, Nov 25, 2021 8:29 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > On Thu, Nov 25, 2021 at 1:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Nov 24, 2021 at 5:14 PM Masahiko Sawada
> > <sawada.mshk@gmail.com> wrote:
> > > >
> > > > Changed. I've removed first_error_time as per discussion on the
> > > > thread for adding xact stats.
> > > >
> > >
> > > We also agreed to change the column names to start with last_error_*
> > > [1]. Is there a reason to not make those changes? Do you think that we
> > > can change it just before committing that patch? I thought it might be
> > > better to do it that way now itself.
> >
> > Oh, I thought that you think that we change the column names when adding xact
> > stats to the view. But these names also make sense even without the xact stats.
> > I've attached an updated patch. It also incorporated comments from Vignesh
> > and Greg.
> >
> Hi,
>
> I only noticed some minor things in the testcases
>
> 1)
> +$node_publisher->append_conf('postgresql.conf',
> +                            qq[
> +logical_decoding_work_mem = 64kB
> +]);
>
> It seems we don’t need set the decode_work_mem since we don't test streaming ?
>
> 2)
> +$node_publisher->safe_psql('postgres',
> +                          q[
> +CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
> +]);
>
> There are a few places where only one command exists in the 'q[' or 'qq[' like the above code.
> To be consistent, I think it might be better to remove the wrap here, maybe we can write like:
> $node_publisher->safe_psql('postgres',
>         ' CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;');
>

Indeed. Attached an updated patch. Thanks!

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

v26-0001-Add-a-subscription-worker-statistics-view-pg_sta.patch

RE: Skipping logical replication transactions on subscriber side

From

"tanghy.fnst@fujitsu.com"

Date:

26 November 2021, 05:15:06

On Friday, November 26, 2021 9:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> 
> Indeed. Attached an updated patch. Thanks!

Thanks for your patch. A small comment:

+       OID of the relation that the worker is synchronizing; null for the
+       main apply worker

Should we modify it to "OID of the relation that the worker was synchronizing ..."?

The rest of the patch LGTM.

Regards
Tang

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

27 November 2021, 13:56:40

On Fri, Nov 26, 2021 at 6:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Indeed. Attached an updated patch. Thanks!
>

I have made a number of changes in the attached patch, which include:
(a) the patch was trying to register multiple array entries for the
same subscription, which doesn't seem to be required; see the changes
in pgstat_vacuum_stat,
(b) multiple changes in the test: reduced wal_retrieve_retry_interval
to 2s, which has cut the test time in half; removed the check related
to resetting of stats, as there is no guarantee that the message will
be received by the collector and we were not sending it again; changed
the test case file name to 026_stats, as we can add more
subscription-related stats in this test file itself,
(c) added/edited multiple comments,
(d) updated PGSTAT_FILE_FORMAT_ID.
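
As an illustration of change (a), here is a rough sketch in C of
collecting each dead subscription's OID only once, even when several
worker entries (the apply worker plus table sync workers) share it. All
names here (WorkerKey, collect_dead_subids, oid_in) are hypothetical,
plain arrays stand in for the hash tables, and unlike the real
pgstat_vacuum_stat it simply stops when the output array is full
instead of sending the message and starting a new one:

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;   /* local stand-in for PostgreSQL's Oid */

/* Hypothetical key of a worker stats entry: subscription plus relation. */
typedef struct WorkerKey
{
    Oid subid;  /* subscription the worker belongs to */
    Oid relid;  /* relation being synced; 0 for the main apply worker */
} WorkerKey;

/* Linear search; the real code would probe a hash table instead. */
static bool
oid_in(const Oid *arr, size_t n, Oid v)
{
    for (size_t i = 0; i < n; i++)
        if (arr[i] == v)
            return true;
    return false;
}

/*
 * Fill dead_subids with the subscriptions that have worker entries but
 * no longer exist, listing each subscription at most once.  Returns the
 * number of OIDs written.
 */
size_t
collect_dead_subids(const WorkerKey *workers, size_t nworkers,
                    const Oid *live_subids, size_t nlive,
                    Oid *dead_subids, size_t max_dead)
{
    size_t ndead = 0;

    for (size_t i = 0; i < nworkers; i++)
    {
        Oid subid = workers[i].subid;

        if (oid_in(live_subids, nlive, subid))
            continue;           /* subscription still exists */
        if (oid_in(dead_subids, ndead, subid))
            continue;           /* already registered for purge */
        if (ndead == max_dead)
            break;              /* sketch only: no room left */

        dead_subids[ndead++] = subid;
    }

    return ndead;
}
```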

Do let me know what you think of the attached?

-- 
With Regards,
Amit Kapila.

Attachments

v27-0001-Add-a-view-to-show-the-stats-of-subscription-wor.patch

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

27 November 2021, 13:58:16

On Fri, Nov 26, 2021 at 7:45 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Friday, November 26, 2021 9:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Indeed. Attached an updated patch. Thanks!
>
> Thanks for your patch. A small comment:
>
> +       OID of the relation that the worker is synchronizing; null for the
> +       main apply worker
>
> Should we modify it to "OID of the relation that the worker was synchronizing ..."?
>

I don't think this change is required, see the description of the
similar column in pg_stat_subscription.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

29 November 2021, 04:42:22

On Sat, Nov 27, 2021 at 7:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Nov 26, 2021 at 6:00 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Indeed. Attached an updated patch. Thanks!
> >
>

Thank you for updating the patch!

> I have made a number of changes in the attached patch which includes
> (a) the patch was trying to register multiple array entries for the
> same subscription which doesn't seem to be required, see changes in
> pgstat_vacuum_stat, (b) multiple changes in the test like reduced the
> wal_retrieve_retry_interval to 2s which has reduced the test time to
> half, remove the check related to resetting of stats as there is no
> guarantee that the message will be received by the collector and we
> were not sending it again, changed the test case file name to
> 026_stats as we can add more subscription-related stats in this test
> file itself

Since we already have the pg_stat_subscription view, how about
026_worker_stats.pl?

The rest looks good to me.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

November 29, 2021, 06:43:34

On Mon, Nov 29, 2021 at 7:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sat, Nov 27, 2021 at 7:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> Thank you for updating the patch!
>
> > I have made a number of changes in the attached patch which includes
> > (a) the patch was trying to register multiple array entries for the
> > same subscription which doesn't seem to be required, see changes in
> > pgstat_vacuum_stat, (b) multiple changes in the test like reduced the
> > wal_retrieve_retry_interval to 2s which has reduced the test time to
> > half, remove the check related to resetting of stats as there is no
> > guarantee that the message will be received by the collector and we
> > were not sending it again, changed the test case file name to
> > 026_stats as we can add more subscription-related stats in this test
> > file itself
>
> Since we have pg_stat_subscription view, how about 026_worker_stats.pl?
>

Sounds better. Updated patch attached.

> The rests look good to me.
>

Okay, I'll push this patch tomorrow unless there are more comments.

-- 
With Regards,
Amit Kapila.

Attachments

v28-0001-Add-a-view-to-show-the-stats-of-subscription-wor.patch

Re: Skipping logical replication transactions on subscriber side

From

vignesh C

Date:

November 29, 2021, 09:07:50

On Mon, Nov 29, 2021 at 9:13 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Nov 29, 2021 at 7:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Sat, Nov 27, 2021 at 7:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > Thank you for updating the patch!
> >
> > > I have made a number of changes in the attached patch which includes
> > > (a) the patch was trying to register multiple array entries for the
> > > same subscription which doesn't seem to be required, see changes in
> > > pgstat_vacuum_stat, (b) multiple changes in the test like reduced the
> > > wal_retrieve_retry_interval to 2s which has reduced the test time to
> > > half, remove the check related to resetting of stats as there is no
> > > guarantee that the message will be received by the collector and we
> > > were not sending it again, changed the test case file name to
> > > 026_stats as we can add more subscription-related stats in this test
> > > file itself
> >
> > Since we have pg_stat_subscription view, how about 026_worker_stats.pl?
> >
>
> Sounds better. Updated patch attached.

Thanks for the updated patch, the v28 patch looks good to me.

Regards,
Vignesh

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

November 30, 2021, 12:28:01

On Mon, Nov 29, 2021 at 11:38 AM vignesh C <vignesh21@gmail.com> wrote:
>

I have pushed this patch and there is a buildfarm failure for it. See:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sidewinder&dt=2021-11-30%2005%3A05%3A25

Sawada-San has shared his initial analysis on pgsql-committers [1] and
I am responding here as the fix requires some more discussion.

> Looking at the result the test actually got, we had two error entries
> for test_tab1 instead of one:
>
> #   Failed test 'check the error reported by the apply worker'
> #   at t/026_worker_stats.pl line 33.
> #          got: 'tap_sub|INSERT|test_tab1|t
> # tap_sub||test_tab1|t'
> #     expected: 'tap_sub|INSERT|test_tab1|t'
>
> The possible scenarios are:
>
> The table sync worker for test_tab1 failed due to an error unrelated
> to apply changes:
>
> 2021-11-30 06:24:02.137 CET [18990:2] ERROR:  replication origin with
> OID 2 is already active for PID 23706
>
> At this time, the view had one error entry for the table sync worker.
> After retrying table sync, it succeeded:
>
> 2021-11-30 06:24:04.202 CET [28117:2] LOG:  logical replication table
> synchronization worker for subscription "tap_sub", table "test_tab1"
> has finished
>
> Then after inserting a row on the publisher, the apply worker tried to
> insert the row but failed due to a unique key violation, which is
> expected:
>
> 2021-11-30 06:24:04.307 CET [4806:2] ERROR:  duplicate key value
> violates unique constraint "test_tab1_pkey"
> 2021-11-30 06:24:04.307 CET [4806:3] DETAIL:  Key (a)=(1) already exists.
> 2021-11-30 06:24:04.307 CET [4806:4] CONTEXT:  processing remote data
> during "INSERT" for replication target relation "public.test_tab1" in
> transaction 721 at 2021-11-30 06:24:04.305096+01
>
> As a result, we had two error entries for test_tab1: the table sync
> worker error and the apply worker error. I didn't expect that the
> table sync worker for test_tab1 would fail due to the "replication origin
> with OID 2 is already active for PID 23706" error.
>
> Looking at test_subscription_error() in 026_worker_stats.pl, we have
> two checks; in the first check, we wait for the view to show the error
> entry for the given relation name and xid. This check passed since
> we had the second error (i.e., apply worker error). In the second
> check, we get error entries from pg_stat_subscription_workers by
> specifying only the relation name. Therefore, we ended up getting two
> entries and failed the test.
>
> To fix this issue, I think that in the second check, we can get the
> error from pg_stat_subscription_workers by specifying the relation
> name *and* xid like the first check does. I've attached the patch.
> What do you think?
>

I think this will fix the reported failure but there is another race
condition in the test. Isn't it possible that for table test_tab2 we
get a "replication origin with OID ..." error, or some other error,
before the copy? In that case too, we would proceed past the second call
of test_subscription_error(), which is not what we expect in the test.
Shouldn't we somehow check that the error message also starts with
"duplicate key value violates ..."?
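
To make the race concrete, here is a minimal Python sketch that models the
error entries for test_tab1 as a plain list of rows. The row layout and the
helper name select_errors are illustrative assumptions, not the test's actual
code; the xid and the messages are taken from the log excerpts quoted above.

```python
# Toy model of the error entries for test_tab1 in the buildfarm failure.
# Field names are illustrative; they are not the view's real column list.
rows = [
    # Left behind by the table sync worker's first, failed attempt.
    {"relname": "test_tab1", "command": None, "xid": None,
     "message": "replication origin with OID 2 is already active for PID 23706"},
    # Reported by the apply worker; this is the entry the test wants.
    {"relname": "test_tab1", "command": "INSERT", "xid": 721,
     "message": 'duplicate key value violates unique constraint "test_tab1_pkey"'},
]


def select_errors(rows, relname, xid=None, errmsg_prefix=None):
    """Return the rows matching the conditions that were actually supplied."""
    matched = []
    for row in rows:
        if row["relname"] != relname:
            continue
        if xid is not None and row["xid"] != xid:
            continue
        if errmsg_prefix is not None and not row["message"].startswith(errmsg_prefix):
            continue
        matched.append(row)
    return matched


# Filtering by relation only returns both entries, as in the failed run.
by_relation = select_errors(rows, "test_tab1")
# Adding the xid and the message prefix narrows it to the expected entry.
by_prefix = select_errors(rows, "test_tab1", xid=721,
                          errmsg_prefix="duplicate key value violates")
print(len(by_relation), len(by_prefix))
```

In this model the prefix condition is what rules out an unrelated error for
the same relation, which is the case the xid condition alone may not cover.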

[1] -
https://www.postgresql.org/message-id/CAD21AoChP5wOT2AYziF%2B-j7vvThF2NyAs7wr%2Byy%2B8hsnu%3D8Rgg%40mail.gmail.com

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

November 30, 2021, 14:41:52

On Tue, Nov 30, 2021 at 6:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Nov 29, 2021 at 11:38 AM vignesh C <vignesh21@gmail.com> wrote:
> >
>
> I have pushed this patch and there is a buildfarm failure for it. See:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sidewinder&dt=2021-11-30%2005%3A05%3A25
>
> Sawada-San has shared his initial analysis on pgsql-committers [1] and
> I am responding here as the fix requires some more discussion.
>
> > Looking at the result the test actually got, we had two error entries
> > for test_tab1 instead of one:
> >
> > #   Failed test 'check the error reported by the apply worker'
> > #   at t/026_worker_stats.pl line 33.
> > #          got: 'tap_sub|INSERT|test_tab1|t
> > # tap_sub||test_tab1|t'
> > #     expected: 'tap_sub|INSERT|test_tab1|t'
> >
> > The possible scenarios are:
> >
> > The table sync worker for test_tab1 failed due to an error unrelated
> > to apply changes:
> >
> > 2021-11-30 06:24:02.137 CET [18990:2] ERROR:  replication origin with
> > OID 2 is already active for PID 23706
> >
> > At this time, the view had one error entry for the table sync worker.
> > After retrying table sync, it succeeded:
> >
> > 2021-11-30 06:24:04.202 CET [28117:2] LOG:  logical replication table
> > synchronization worker for subscription "tap_sub", table "test_tab1"
> > has finished
> >
> > Then after inserting a row on the publisher, the apply worker inserted
> > the row but failed due to violating a unique key violation, which is
> > expected:
> >
> > 2021-11-30 06:24:04.307 CET [4806:2] ERROR:  duplicate key value
> > violates unique constraint "test_tab1_pkey"
> > 2021-11-30 06:24:04.307 CET [4806:3] DETAIL:  Key (a)=(1) already exists.
> > 2021-11-30 06:24:04.307 CET [4806:4] CONTEXT:  processing remote data
> > during "INSERT" for replication target relation "public.test_tab1" in
> > transaction 721 at 2021-11-30 06:24:04.305096+01
> >
> > As a result, we had two error entries for test_tab1: the table sync
> > worker error and the apply worker error. I didn't expect that the
> > table sync worker for test_tab1 failed due to "replication origin with
> > OID 2 is already active for PID 23706” error.
> >
> > Looking at test_subscription_error() in 026_worker_stats.pl, we have
> > two checks; in the first check, we wait for the view to show the error
> > entry for the given relation name and xid. This check was passed since
> > we had the second error (i.g., apply worker error). In the second
> > check, we get error entries from pg_stat_subscription_workers by
> > specifying only the relation name. Therefore, we ended up getting two
> > entries and failed the tests.
> >
> > To fix this issue, I think that in the second check, we can get the
> > error from pg_stat_subscription_workers by specifying the relation
> > name *and* xid like the first check does. I've attached the patch.
> > What do you think?
> >
>
> I think this will fix the reported failure but there is another race
> condition in the test. Isn't it possible that for table test_tab2, we
> get an error "replication origin with OID ..." or some other error
> before copy, in that case also, we will proceed from the second call
> of test_subscription_error() which is not what we expect in the test?

Right.

> Shouldn't we someway check that the error message also starts with
> "duplicate key value violates ..."?

Yeah, I think it's a good idea to make the checks more specific. That
is, probably we can specify the prefix of the error message and
subrelid in addition to the current conditions: relid and xid. That
way, we can check what error was reported by which workers (tablesync
or apply) for which relations. And both check queries in
test_subscription_error() can have the same WHERE clause.
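
As a rough sketch of that idea (written in Python rather than the test's Perl,
and not the attached patch), both queries could share one FROM/WHERE tail. The
function name build_part_sql and the column names subrelid, last_error_xid and
last_error_message are assumptions made for illustration.

```python
def build_part_sql(relname, errmsg_prefix, xid=None, by_apply_worker=True):
    """Build the FROM/WHERE tail shared by the wait query and the check query.

    Values are interpolated directly because they are trusted test constants;
    this is not safe for untrusted input.
    """
    conditions = [
        f"last_error_relid = '{relname}'::regclass",
        f"starts_with(last_error_message, '{errmsg_prefix}')",
        # Assumed convention: subrelid is null for the apply worker and set
        # to the table being synchronized for a tablesync worker.
        "subrelid IS NULL" if by_apply_worker
        else f"subrelid = '{relname}'::regclass",
        "last_error_xid IS NULL" if xid is None
        else f"last_error_xid = '{xid}'::xid",
    ]
    return (" FROM pg_stat_subscription_workers WHERE "
            + " AND ".join(conditions))


part_sql = build_part_sql("test_tab1", "duplicate key value violates", xid=721)
wait_sql = "SELECT count(1) > 0" + part_sql
check_sql = ("SELECT subname, last_error_command, last_error_relid::regclass, "
             "last_error_count > 0" + part_sql)
print(wait_sql)
```

Because both queries end with the same tail, the row the wait query saw is the
only row the check query can return, as long as the entry is not overwritten
in between.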

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

November 30, 2021, 16:38:40

On Tue, Nov 30, 2021 at 8:41 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Nov 30, 2021 at 6:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Nov 29, 2021 at 11:38 AM vignesh C <vignesh21@gmail.com> wrote:
> > >
> >
> > I have pushed this patch and there is a buildfarm failure for it. See:
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sidewinder&dt=2021-11-30%2005%3A05%3A25
> >
> > Sawada-San has shared his initial analysis on pgsql-committers [1] and
> > I am responding here as the fix requires some more discussion.
> >
> > > Looking at the result the test actually got, we had two error entries
> > > for test_tab1 instead of one:
> > >
> > > #   Failed test 'check the error reported by the apply worker'
> > > #   at t/026_worker_stats.pl line 33.
> > > #          got: 'tap_sub|INSERT|test_tab1|t
> > > # tap_sub||test_tab1|t'
> > > #     expected: 'tap_sub|INSERT|test_tab1|t'
> > >
> > > The possible scenarios are:
> > >
> > > The table sync worker for test_tab1 failed due to an error unrelated
> > > to apply changes:
> > >
> > > 2021-11-30 06:24:02.137 CET [18990:2] ERROR:  replication origin with
> > > OID 2 is already active for PID 23706
> > >
> > > At this time, the view had one error entry for the table sync worker.
> > > After retrying table sync, it succeeded:
> > >
> > > 2021-11-30 06:24:04.202 CET [28117:2] LOG:  logical replication table
> > > synchronization worker for subscription "tap_sub", table "test_tab1"
> > > has finished
> > >
> > > Then after inserting a row on the publisher, the apply worker inserted
> > > the row but failed due to violating a unique key violation, which is
> > > expected:
> > >
> > > 2021-11-30 06:24:04.307 CET [4806:2] ERROR:  duplicate key value
> > > violates unique constraint "test_tab1_pkey"
> > > 2021-11-30 06:24:04.307 CET [4806:3] DETAIL:  Key (a)=(1) already exists.
> > > 2021-11-30 06:24:04.307 CET [4806:4] CONTEXT:  processing remote data
> > > during "INSERT" for replication target relation "public.test_tab1" in
> > > transaction 721 at 2021-11-30 06:24:04.305096+01
> > >
> > > As a result, we had two error entries for test_tab1: the table sync
> > > worker error and the apply worker error. I didn't expect that the
> > > table sync worker for test_tab1 failed due to "replication origin with
> > > OID 2 is already active for PID 23706” error.
> > >
> > > Looking at test_subscription_error() in 026_worker_stats.pl, we have
> > > two checks; in the first check, we wait for the view to show the error
> > > entry for the given relation name and xid. This check was passed since
> > > we had the second error (i.g., apply worker error). In the second
> > > check, we get error entries from pg_stat_subscription_workers by
> > > specifying only the relation name. Therefore, we ended up getting two
> > > entries and failed the tests.
> > >
> > > To fix this issue, I think that in the second check, we can get the
> > > error from pg_stat_subscription_workers by specifying the relation
> > > name *and* xid like the first check does. I've attached the patch.
> > > What do you think?
> > >
> >
> > I think this will fix the reported failure but there is another race
> > condition in the test. Isn't it possible that for table test_tab2, we
> > get an error "replication origin with OID ..." or some other error
> > before copy, in that case also, we will proceed from the second call
> > of test_subscription_error() which is not what we expect in the test?
>
> Right.
>
> > Shouldn't we someway check that the error message also starts with
> > "duplicate key value violates ..."?
>
> Yeah, I think it's a good idea to make the checks more specific. That
> is, probably we can specify the prefix of the error message and
> subrelid in addition to the current conditions: relid and xid. That
> way, we can check what error was reported by which workers (tablesync
> or apply) for which relations. And both check queries in
> test_subscription_error() can have the same WHERE clause.

I've attached a patch that fixes this issue. Please review it.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

0001-Fix-regression-test-failure-caused-by-commit-8d74fc9.patch

Re: Skipping logical replication transactions on subscriber side

From

vignesh C

Date:

November 30, 2021, 19:44:06

On Tue, Nov 30, 2021 at 7:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Nov 30, 2021 at 8:41 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Nov 30, 2021 at 6:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Nov 29, 2021 at 11:38 AM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > >
> > > I have pushed this patch and there is a buildfarm failure for it. See:
> > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sidewinder&dt=2021-11-30%2005%3A05%3A25
> > >
> > > Sawada-San has shared his initial analysis on pgsql-committers [1] and
> > > I am responding here as the fix requires some more discussion.
> > >
> > > > Looking at the result the test actually got, we had two error entries
> > > > for test_tab1 instead of one:
> > > >
> > > > #   Failed test 'check the error reported by the apply worker'
> > > > #   at t/026_worker_stats.pl line 33.
> > > > #          got: 'tap_sub|INSERT|test_tab1|t
> > > > # tap_sub||test_tab1|t'
> > > > #     expected: 'tap_sub|INSERT|test_tab1|t'
> > > >
> > > > The possible scenarios are:
> > > >
> > > > The table sync worker for test_tab1 failed due to an error unrelated
> > > > to apply changes:
> > > >
> > > > 2021-11-30 06:24:02.137 CET [18990:2] ERROR:  replication origin with
> > > > OID 2 is already active for PID 23706
> > > >
> > > > At this time, the view had one error entry for the table sync worker.
> > > > After retrying table sync, it succeeded:
> > > >
> > > > 2021-11-30 06:24:04.202 CET [28117:2] LOG:  logical replication table
> > > > synchronization worker for subscription "tap_sub", table "test_tab1"
> > > > has finished
> > > >
> > > > Then after inserting a row on the publisher, the apply worker inserted
> > > > the row but failed due to violating a unique key violation, which is
> > > > expected:
> > > >
> > > > 2021-11-30 06:24:04.307 CET [4806:2] ERROR:  duplicate key value
> > > > violates unique constraint "test_tab1_pkey"
> > > > 2021-11-30 06:24:04.307 CET [4806:3] DETAIL:  Key (a)=(1) already exists.
> > > > 2021-11-30 06:24:04.307 CET [4806:4] CONTEXT:  processing remote data
> > > > during "INSERT" for replication target relation "public.test_tab1" in
> > > > transaction 721 at 2021-11-30 06:24:04.305096+01
> > > >
> > > > As a result, we had two error entries for test_tab1: the table sync
> > > > worker error and the apply worker error. I didn't expect that the
> > > > table sync worker for test_tab1 failed due to "replication origin with
> > > > OID 2 is already active for PID 23706” error.
> > > >
> > > > Looking at test_subscription_error() in 026_worker_stats.pl, we have
> > > > two checks; in the first check, we wait for the view to show the error
> > > > entry for the given relation name and xid. This check was passed since
> > > > we had the second error (i.g., apply worker error). In the second
> > > > check, we get error entries from pg_stat_subscription_workers by
> > > > specifying only the relation name. Therefore, we ended up getting two
> > > > entries and failed the tests.
> > > >
> > > > To fix this issue, I think that in the second check, we can get the
> > > > error from pg_stat_subscription_workers by specifying the relation
> > > > name *and* xid like the first check does. I've attached the patch.
> > > > What do you think?
> > > >
> > >
> > > I think this will fix the reported failure but there is another race
> > > condition in the test. Isn't it possible that for table test_tab2, we
> > > get an error "replication origin with OID ..." or some other error
> > > before copy, in that case also, we will proceed from the second call
> > > of test_subscription_error() which is not what we expect in the test?
> >
> > Right.
> >
> > > Shouldn't we someway check that the error message also starts with
> > > "duplicate key value violates ..."?
> >
> > Yeah, I think it's a good idea to make the checks more specific. That
> > is, probably we can specify the prefix of the error message and
> > subrelid in addition to the current conditions: relid and xid. That
> > way, we can check what error was reported by which workers (tablesync
> > or apply) for which relations. And both check queries in
> > test_subscription_error() can have the same WHERE clause.
>
> I've attached a patch that fixes this issue. Please review it.

Thanks for the updated patch. The patch applies cleanly and make
check-world passes. I also ran the failing test in a loop and it
passed every time.

Regards,
Vignesh

RE: Skipping logical replication transactions on subscriber side

From

"houzj.fnst@fujitsu.com"

Date:

December 1, 2021, 05:53:53

On Tues, Nov 30, 2021 9:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Tue, Nov 30, 2021 at 8:41 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Nov 30, 2021 at 6:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Nov 29, 2021 at 11:38 AM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > >
> > > I have pushed this patch and there is a buildfarm failure for it. See:
> > > > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sidewinder&dt=2021-11-30%2005%3A05%3A25
> > >
> > > Sawada-San has shared his initial analysis on pgsql-committers [1]
> > > and I am responding here as the fix requires some more discussion.
> > >
> > > > Looking at the result the test actually got, we had two error
> > > > entries for test_tab1 instead of one:
> > > >
> > > > #   Failed test 'check the error reported by the apply worker'
> > > > #   at t/026_worker_stats.pl line 33.
> > > > #          got: 'tap_sub|INSERT|test_tab1|t
> > > > # tap_sub||test_tab1|t'
> > > > #     expected: 'tap_sub|INSERT|test_tab1|t'
> > > >
> > > > The possible scenarios are:
> > > >
> > > > The table sync worker for test_tab1 failed due to an error
> > > > unrelated to apply changes:
> > > >
> > > > 2021-11-30 06:24:02.137 CET [18990:2] ERROR:  replication origin
> > > > with OID 2 is already active for PID 23706
> > > >
> > > > At this time, the view had one error entry for the table sync worker.
> > > > After retrying table sync, it succeeded:
> > > >
> > > > 2021-11-30 06:24:04.202 CET [28117:2] LOG:  logical replication
> > > > table synchronization worker for subscription "tap_sub", table
> "test_tab1"
> > > > has finished
> > > >
> > > > Then after inserting a row on the publisher, the apply worker
> > > > inserted the row but failed due to violating a unique key
> > > > violation, which is
> > > > expected:
> > > >
> > > > 2021-11-30 06:24:04.307 CET [4806:2] ERROR:  duplicate key value
> > > > violates unique constraint "test_tab1_pkey"
> > > > 2021-11-30 06:24:04.307 CET [4806:3] DETAIL:  Key (a)=(1) already exists.
> > > > 2021-11-30 06:24:04.307 CET [4806:4] CONTEXT:  processing remote
> > > > data during "INSERT" for replication target relation
> > > > "public.test_tab1" in transaction 721 at 2021-11-30
> > > > 06:24:04.305096+01
> > > >
> > > > As a result, we had two error entries for test_tab1: the table
> > > > sync worker error and the apply worker error. I didn't expect that
> > > > the table sync worker for test_tab1 failed due to "replication
> > > > origin with OID 2 is already active for PID 23706” error.
> > > >
> > > > Looking at test_subscription_error() in 026_worker_stats.pl, we
> > > > have two checks; in the first check, we wait for the view to show
> > > > the error entry for the given relation name and xid. This check
> > > > was passed since we had the second error (i.g., apply worker
> > > > error). In the second check, we get error entries from
> > > > pg_stat_subscription_workers by specifying only the relation name.
> > > > Therefore, we ended up getting two entries and failed the tests.
> > > >
> > > > To fix this issue, I think that in the second check, we can get
> > > > the error from pg_stat_subscription_workers by specifying the
> > > > relation name *and* xid like the first check does. I've attached the patch.
> > > > What do you think?
> > > >
> > >
> > > I think this will fix the reported failure but there is another race
> > > condition in the test. Isn't it possible that for table test_tab2,
> > > we get an error "replication origin with OID ..." or some other
> > > > error before copy, in that case also, we will proceed from the
> > > > second call of test_subscription_error() which is not what we
> > > > expect in the test?
> >
> > Right.
> >
> > > Shouldn't we someway check that the error message also starts with
> > > "duplicate key value violates ..."?
> >
> > Yeah, I think it's a good idea to make the checks more specific. That
> > is, probably we can specify the prefix of the error message and
> > subrelid in addition to the current conditions: relid and xid. That
> > way, we can check what error was reported by which workers (tablesync
> > or apply) for which relations. And both check queries in
> > test_subscription_error() can have the same WHERE clause.
> 
> I've attached a patch that fixes this issue. Please review it.
> 

I have a question about the testcase (I could be wrong here).

Is it possible for a race condition to happen between the apply worker
(test_tab1) and the table sync worker (test_tab2)? If so, it seems the error
("replication origin with OID") could happen randomly until we resolve the
conflict. Based on this, for the following code:
-----
    # Wait for the error statistics to be updated.
    my $check_sql = qq[SELECT count(1) > 0 ] . $part_sql;
    $node->poll_query_until(
        'postgres', $check_sql,
    ) or die "Timed out while waiting for statistics to be updated";

* [1] *

    $check_sql =
        qq[
SELECT subname, last_error_command, last_error_relid::regclass,
last_error_count > 0 ] . $part_sql;
    my $result = $node->safe_psql('postgres', $check_sql);
    is($result, $expected, $msg);
-----

Is it possible that the error ("replication origin with OID") happens again at
place [1]? In that case, the error message we have checked could be replaced by
another error ("replication origin ...") and the test would then fail.

Best regards,
Hou zj

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

December 1, 2021, 06:22:20

On Wed, Dec 1, 2021 at 8:24 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Tues, Nov 30, 2021 9:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > > Shouldn't we someway check that the error message also starts with
> > > > "duplicate key value violates ..."?
> > >
> > > Yeah, I think it's a good idea to make the checks more specific. That
> > > is, probably we can specify the prefix of the error message and
> > > subrelid in addition to the current conditions: relid and xid. That
> > > way, we can check what error was reported by which workers (tablesync
> > > or apply) for which relations. And both check queries in
> > > test_subscription_error() can have the same WHERE clause.
> >
> > I've attached a patch that fixes this issue. Please review it.
> >
>
> I have a question about the testcase (I could be wrong here).
>
> Is it possible that the race condition happen between apply worker(test_tab1)
> and table sync worker(test_tab2) ? If so, it seems the error("replication
> origin with OID") could happen randomly until we resolve the conflict.
> Based on this, for the following code:
> -----
>     # Wait for the error statistics to be updated.
>     my $check_sql = qq[SELECT count(1) > 0 ] . $part_sql;
>     $node->poll_query_until(
>         'postgres', $check_sql,
> ) or die "Timed out while waiting for statistics to be updated";
>
> * [1] *
>
>     $check_sql =
>         qq[
> SELECT subname, last_error_command, last_error_relid::regclass,
> last_error_count > 0 ] . $part_sql;
>     my $result = $node->safe_psql('postgres', $check_sql);
>     is($result, $expected, $msg);
> -----
>
> Is it possible that the error("replication origin with OID") happen again at the
> place [1]. In this case, the error message we have checked could be replaced by
> another error("replication origin ...") and then the test fail ?
>

Once we get the "duplicate key violation ..." error before * [1] * via
the apply worker, we shouldn't get a replication origin-specific error,
because the origin setup is done before starting to apply changes.
Also, even if that or some other error happens after * [1] *, the check
should still succeed because of the errmsg_prefix check. Does that make
sense?

-- 
With Regards,
Amit Kapila.

RE: Skipping logical replication transactions on subscriber side

From

"houzj.fnst@fujitsu.com"

Date:

December 1, 2021, 06:39:20

On Wed, Dec 1, 2021 11:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Dec 1, 2021 at 8:24 AM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Tues, Nov 30, 2021 9:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > > Shouldn't we someway check that the error message also starts with
> > > > > "duplicate key value violates ..."?
> > > >
> > > > Yeah, I think it's a good idea to make the checks more specific. That
> > > > is, probably we can specify the prefix of the error message and
> > > > subrelid in addition to the current conditions: relid and xid. That
> > > > way, we can check what error was reported by which workers (tablesync
> > > > or apply) for which relations. And both check queries in
> > > > test_subscription_error() can have the same WHERE clause.
> > >
> > > I've attached a patch that fixes this issue. Please review it.
> > >
> >
> > I have a question about the testcase (I could be wrong here).
> >
> > Is it possible that the race condition happen between apply worker(test_tab1)
> > and table sync worker(test_tab2) ? If so, it seems the error("replication
> > origin with OID") could happen randomly until we resolve the conflict.
> > Based on this, for the following code:
> > -----
> >     # Wait for the error statistics to be updated.
> >     my $check_sql = qq[SELECT count(1) > 0 ] . $part_sql;
> >     $node->poll_query_until(
> >         'postgres', $check_sql,
> > ) or die "Timed out while waiting for statistics to be updated";
> >
> > * [1] *
> >
> >     $check_sql =
> >         qq[
> > SELECT subname, last_error_command, last_error_relid::regclass,
> > last_error_count > 0 ] . $part_sql;
> >     my $result = $node->safe_psql('postgres', $check_sql);
> >     is($result, $expected, $msg);
> > -----
> >
> > Is it possible that the error("replication origin with OID") happen again at the
> > place [1]. In this case, the error message we have checked could be replaced by
> > another error("replication origin ...") and then the test fail ?
> >
> 
> Once we get the "duplicate key violation ..." error before * [1] * via
> apply_worker then we shouldn't get replication origin-specific error
> because the origin set up is done before starting to apply changes.
> Also, even if that or some other happens after * [1] * because of
> errmsg_prefix check it should still succeed. Does that make sense?

Oh, I missed the point that the origin setup has already been done by the
time we get the expected error.
Thanks for the explanation, and I think the patch looks good.

Best regards,
Hou zj

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

December 1, 2021, 06:41:31

On Wed, Dec 1, 2021 at 12:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Dec 1, 2021 at 8:24 AM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Tues, Nov 30, 2021 9:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > > Shouldn't we someway check that the error message also starts with
> > > > > "duplicate key value violates ..."?
> > > >
> > > > Yeah, I think it's a good idea to make the checks more specific. That
> > > > is, probably we can specify the prefix of the error message and
> > > > subrelid in addition to the current conditions: relid and xid. That
> > > > way, we can check what error was reported by which workers (tablesync
> > > > or apply) for which relations. And both check queries in
> > > > test_subscription_error() can have the same WHERE clause.
> > >
> > > I've attached a patch that fixes this issue. Please review it.
> > >
> >
> > I have a question about the testcase (I could be wrong here).
> >
> > Is it possible that the race condition happen between apply worker(test_tab1)
> > and table sync worker(test_tab2) ? If so, it seems the error("replication
> > origin with OID") could happen randomly until we resolve the conflict.
> > Based on this, for the following code:
> > -----
> >     # Wait for the error statistics to be updated.
> >     my $check_sql = qq[SELECT count(1) > 0 ] . $part_sql;
> >     $node->poll_query_until(
> >         'postgres', $check_sql,
> > ) or die "Timed out while waiting for statistics to be updated";
> >
> > * [1] *
> >
> >     $check_sql =
> >         qq[
> > SELECT subname, last_error_command, last_error_relid::regclass,
> > last_error_count > 0 ] . $part_sql;
> >     my $result = $node->safe_psql('postgres', $check_sql);
> >     is($result, $expected, $msg);
> > -----
> >
> > Is it possible that the error("replication origin with OID") happen again at the
> > place [1]. In this case, the error message we have checked could be replaced by
> > another error("replication origin ...") and then the test fail ?
> >
>
> Once we get the "duplicate key violation ..." error before * [1] * via
> apply_worker then we shouldn't get replication origin-specific error
> because the origin set up is done before starting to apply changes.

Right.

> Also, even if that or some other happens after * [1] * because of
> errmsg_prefix check it should still succeed.

In this case, the old error ("duplicate key violation ...") is
overwritten by a new error (e.g., a connection error; I'm not sure how
likely that is) and the test fails because the query returns no
entries, no? If so, the result from the second check_sql is unstable
and it's probably better to check the result only once. That is, the
first check_sql includes the command, and we exit from the function
once we confirm that the error entry has been updated as expected.
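
To illustrate the point, here is a minimal sketch (in Python rather than
the Perl TAP framework, purely as a model) of why one polled check is more
stable than polling for existence and then querying again. FakeErrorStats,
its timeline, the error texts, and the subscription name tap_sub are
hypothetical stand-ins, not the real statistics view or the test code.

```python
class FakeErrorStats:
    """Hypothetical stand-in for the subscription error statistics view."""

    def __init__(self, timeline):
        self._timeline = list(timeline)
        self._pos = 0

    def query(self, errmsg_prefix):
        """Return the current row if its message matches the prefix, else None.

        Each call advances the timeline by one step; the last entry repeats.
        """
        row = self._timeline[min(self._pos, len(self._timeline) - 1)]
        self._pos += 1
        if row is not None and row["errmsg"].startswith(errmsg_prefix):
            return (row["subname"], row["command"], row["relname"])
        return None


def two_step_check(stats, prefix, expected, max_attempts=10):
    """Poll until some matching row exists, then query again and compare."""
    found = any(stats.query(prefix) is not None for _ in range(max_attempts))
    return found and stats.query(prefix) == expected


def single_check(stats, prefix, expected, max_attempts=10):
    """Poll until the query itself returns exactly the expected row."""
    return any(stats.query(prefix) == expected for _ in range(max_attempts))


DUP = {"subname": "tap_sub", "command": "INSERT", "relname": "test_tab1",
       "errmsg": "duplicate key value violates unique constraint"}
OTHER = {"subname": "tap_sub", "command": None, "relname": None,
         "errmsg": "could not connect to the publisher"}

# The expected error appears, is briefly overwritten, then appears again.
TIMELINE = [None, DUP, OTHER, DUP]
EXPECTED = ("tap_sub", "INSERT", "test_tab1")
PREFIX = "duplicate key value violates"

print("two-step check passes:", two_step_check(FakeErrorStats(TIMELINE), PREFIX, EXPECTED))
print("single check passes:  ", single_check(FakeErrorStats(TIMELINE), PREFIX, EXPECTED))
```

In this model the two-step variant fails only because an unrelated error
lands between its two queries, which is the instability described above.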

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

01 December 2021, 07:00:18

On Wed, Dec 1, 2021 at 9:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Dec 1, 2021 at 12:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Dec 1, 2021 at 8:24 AM houzj.fnst@fujitsu.com
> > <houzj.fnst@fujitsu.com> wrote:
> > >
> > > I have a question about the testcase (I could be wrong here).
> > >
> > > Is it possible that the race condition happen between apply worker(test_tab1)
> > > and table sync worker(test_tab2) ? If so, it seems the error("replication
> > > origin with OID") could happen randomly until we resolve the conflict.
> > > Based on this, for the following code:
> > > -----
> > >     # Wait for the error statistics to be updated.
> > >     my $check_sql = qq[SELECT count(1) > 0 ] . $part_sql;
> > >     $node->poll_query_until(
> > >         'postgres', $check_sql,
> > > ) or die "Timed out while waiting for statistics to be updated";
> > >
> > > * [1] *
> > >
> > >     $check_sql =
> > >         qq[
> > > SELECT subname, last_error_command, last_error_relid::regclass,
> > > last_error_count > 0 ] . $part_sql;
> > >     my $result = $node->safe_psql('postgres', $check_sql);
> > >     is($result, $expected, $msg);
> > > -----
> > >
> > > Is it possible that the error("replication origin with OID") happen again at the
> > > place [1]. In this case, the error message we have checked could be replaced by
> > > another error("replication origin ...") and then the test fail ?
> > >
> >
> > Once we get the "duplicate key violation ..." error before * [1] * via
> > apply_worker then we shouldn't get replication origin-specific error
> > because the origin set up is done before starting to apply changes.
>
> Right.
>
> > Also, even if that or some other happens after * [1] * because of
> > errmsg_prefix check it should still succeed.
>
> In this case, the old error ("duplicate key violation ...") is
> overwritten by a new error (e.g., connection error. not sure how
> possible it is)
>

Yeah, or probably some memory allocation failure. I think the
probability of such failures is very low but OTOH why take the chance?

> and the test fails because the query returns no
> entries, no?
>

Right.

> If so, the result from the second check_sql is unstable
> and it's probably better to check the result only once. That is, the
> first check_sql includes the command and we exit from the function
> once we confirm the error entry is expectedly updated.
>

Yeah, I think that should be fine.

With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

01 December 2021, 08:23:29

On Wed, Dec 1, 2021 at 1:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Dec 1, 2021 at 9:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Dec 1, 2021 at 12:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Dec 1, 2021 at 8:24 AM houzj.fnst@fujitsu.com
> > > <houzj.fnst@fujitsu.com> wrote:
> > > >
> > > > I have a question about the testcase (I could be wrong here).
> > > >
> > > > Is it possible that the race condition happen between apply worker(test_tab1)
> > > > and table sync worker(test_tab2) ? If so, it seems the error("replication
> > > > origin with OID") could happen randomly until we resolve the conflict.
> > > > Based on this, for the following code:
> > > > -----
> > > >     # Wait for the error statistics to be updated.
> > > >     my $check_sql = qq[SELECT count(1) > 0 ] . $part_sql;
> > > >     $node->poll_query_until(
> > > >         'postgres', $check_sql,
> > > > ) or die "Timed out while waiting for statistics to be updated";
> > > >
> > > > * [1] *
> > > >
> > > >     $check_sql =
> > > >         qq[
> > > > SELECT subname, last_error_command, last_error_relid::regclass,
> > > > last_error_count > 0 ] . $part_sql;
> > > >     my $result = $node->safe_psql('postgres', $check_sql);
> > > >     is($result, $expected, $msg);
> > > > -----
> > > >
> > > > Is it possible that the error("replication origin with OID") happen again at the
> > > > place [1]. In this case, the error message we have checked could be replaced by
> > > > another error("replication origin ...") and then the test fail ?
> > > >
> > >
> > > Once we get the "duplicate key violation ..." error before * [1] * via
> > > apply_worker then we shouldn't get replication origin-specific error
> > > because the origin set up is done before starting to apply changes.
> >
> > Right.
> >
> > > Also, even if that or some other happens after * [1] * because of
> > > errmsg_prefix check it should still succeed.
> >
> > In this case, the old error ("duplicate key violation ...") is
> > overwritten by a new error (e.g., connection error. not sure how
> > possible it is)
> >
>
> Yeah, or probably some memory allocation failure. I think the
> probability of such failures is very low but OTOH why take chance.
>
> > and the test fails because the query returns no
> > entries, no?
> >
>
> Right.
>
> > If so, the result from the second check_sql is unstable
> > and it's probably better to check the result only once. That is, the
> > first check_sql includes the command and we exit from the function
> > once we confirm the error entry is expectedly updated.
> >
>
> Yeah, I think that should be fine.

Okay, I've attached an updated patch. Please review it.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

v2-0001-Fix-regression-test-failure-caused-by-commit-8d74.patch

RE: Skipping logical replication transactions on subscriber side

From

"houzj.fnst@fujitsu.com"

Date:

01 December 2021, 09:27:33

On Wednesday, December 1, 2021 1:23 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Wed, Dec 1, 2021 at 1:00 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Wed, Dec 1, 2021 at 9:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > If so, the result from the second check_sql is unstable and it's
> > > probably better to check the result only once. That is, the first
> > > check_sql includes the command and we exit from the function once we
> > > confirm the error entry is expectedly updated.
> > >
> >
> > Yeah, I think that should be fine.
> 
> Okay, I've attached an updated patch. Please review it.
> 

I agree that checking the result only once makes the test more stable.
The patch looks good to me.

Best regards,
Hou zj

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

02 December 2021, 09:48:17

On Wed, Dec 1, 2021 at 11:57 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> On Wednesday, December 1, 2021 1:23 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > Okay, I've attached an updated patch. Please review it.
> >
>
> I agreed that checking the result only once makes the test more stable.
> The patch looks good to me.
>

Pushed.

Now, coming back to the skip_xid patch. To summarize the discussion in
that regard so far, we have discussed various alternatives for the
syntax like:

a. ALTER SUBSCRIPTION ... [SET|RESET] SKIP TRANSACTION xxx;
b. ALTER SUBSCRIPTION <sub_name> SET ( subscription_parameter [=value] [, ... ] );
c. ALTER SUBSCRIPTION <sub_name> ON ERROR ( subscription_parameter [=value] [, ... ] );
d. ALTER SUBSCRIPTION <sub_name> SKIP ( subscription_parameter [=value] [, ... ] );
where subscription_parameter can be one of:
xid = <xid_val>
lsn = <lsn_val>
...

We didn't prefer (a) as it can lead to more keywords as we add more
options; (b) as we want these new skip options to behave and be set
differently than existing subscription properties because of the
difference in their behavior; (c) as that sounds more like an action
to be performed on a future condition (error/conflict) whereas here we
already know that an error has happened.

As per the discussion so far, option (d) seems preferable. For it, we
need to decide what to allow as options and how. The simplest way for the
first version is to allow just one xid to be specified at a time, which
would mean that specifying multiple xids should error out. We can
additionally allow specifying operations like 'insert', 'update',
etc., and then a relation list (list of oids). That would let the user
choose which particular operations and relations of a transaction to
skip.
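
As a rough model of the "one xid at a time" rule for option (d), the
sketch below validates a SKIP option list. It is written in Python only
for illustration; parse_skip_options and its error messages are
hypothetical and do not reflect any actual patch. The bounds assume a
32-bit transaction ID whose normal values start at 3.

```python
FIRST_NORMAL_XID = 3      # assumption: 0, 1 and 2 are reserved XIDs
MAX_XID = 2**32 - 1       # assumption: XIDs are 32-bit unsigned values


def parse_skip_options(options):
    """Validate (name, value) pairs as they might appear in SKIP ( ... ).

    Only a single xid is accepted in this first-version model; any other
    option, or a repeated xid, is rejected.
    """
    parsed = {}
    for name, value in options:
        key = name.lower()
        if key != "xid":
            raise ValueError(f"unrecognized skip option: {name}")
        if key in parsed:
            raise ValueError("xid specified more than once")
        xid = int(value)
        if not FIRST_NORMAL_XID <= xid <= MAX_XID:
            raise ValueError(f"invalid transaction id: {value}")
        parsed[key] = xid
    return parsed


print(parse_skip_options([("xid", "590")]))
try:
    parse_skip_options([("xid", "590"), ("xid", "591")])
except ValueError as err:
    print("rejected:", err)
```

Because the options are name/value pairs, further keys (lsn, operation,
relation) could be added to such a parser later without new keywords.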

I am not sure what exactly we can provide to users to allow skipping
initial table sync as we can't specify XID there. One option that
comes to mind is to allow specifying a combination of copy_data and
relid to skip table sync for a particular relation. We might think of
not doing anything for table sync workers but not sure if that is a
good option.

Thoughts?

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Peter Eisentraut

Date:

02 December 2021, 18:08:02

On 02.12.21 07:48, Amit Kapila wrote:
> a. ALTER SUBSCRIPTION ... [SET|RESET] SKIP TRANSACTION xxx;
> b. Alter Subscription <sub_name> SET ( subscription_parameter [=value]
> [, ... ] );
> c. Alter Subscription <sub_name> On Error ( subscription_parameter
> [=value] [, ... ] );
> d. Alter Subscription <sub_name> SKIP ( subscription_parameter
> [=value] [, ... ] );
> where subscription_parameter can be one of:
> xid = <xid_val>
> lsn = <lsn_val>
> ...

> As per discussion till now, option (d) seems preferable.

I agree.

> In this, we
> need to see how and what to allow as options. The simplest way for the
> first version is to just allow one xid to be specified at a time which
> would mean that specifying multiple xids should error out. We can also
> additionally allow specifying operations like 'insert', 'update',
> etc., and then relation list (list of oids). What that would mean is
> that for a transaction we can allow which particular operations and
> relations we want to skip.

I don't know how difficult it would be, but allowing multiple xids might 
be desirable.  But this syntax gives you flexibility, so we can also 
start with a simple implementation.

> I am not sure what exactly we can provide to users to allow skipping
> initial table sync as we can't specify XID there. One option that
> comes to mind is to allow specifying a combination of copy_data and
> relid to skip table sync for a particular relation. We might think of
> not doing anything for table sync workers but not sure if that is a
> good option.

I don't think this feature should affect tablesync.  The semantics are 
not clear, and it's not really needed.  If the tablesync doesn't work, 
you can try the setup again from scratch.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

03 December 2021, 05:53:25

On Thu, Dec 2, 2021 at 8:38 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 02.12.21 07:48, Amit Kapila wrote:
> > a. ALTER SUBSCRIPTION ... [SET|RESET] SKIP TRANSACTION xxx;
> > b. Alter Subscription <sub_name> SET ( subscription_parameter [=value]
> > [, ... ] );
> > c. Alter Subscription <sub_name> On Error ( subscription_parameter
> > [=value] [, ... ] );
> > d. Alter Subscription <sub_name> SKIP ( subscription_parameter
> > [=value] [, ... ] );
> > where subscription_parameter can be one of:
> > xid = <xid_val>
> > lsn = <lsn_val>
> > ...
>
> > As per discussion till now, option (d) seems preferable.
>
> I agree.
>
> > In this, we
> > need to see how and what to allow as options. The simplest way for the
> > first version is to just allow one xid to be specified at a time which
> > would mean that specifying multiple xids should error out. We can also
> > additionally allow specifying operations like 'insert', 'update',
> > etc., and then relation list (list of oids). What that would mean is
> > that for a transaction we can allow which particular operations and
> > relations we want to skip.
>
> I don't know how difficult it would be, but allowing multiple xids might
> be desirable.
>

Are there many cases where there could be multiple xid failures that
the user can skip? The apply worker always keeps looping on the same
failure, so the user wouldn't know of the second xid failure (if any)
until the first failure is resolved. I could think of one such case,
during the initial synchronization phase, where the apply worker went
ahead of the tablesync worker by skipping the changes on the
corresponding table. After that, it is possible that the tablesync
worker fails during the catch-up phase and the apply worker fails
while processing some other rel.
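
A toy model of that argument, in Python: a worker applies transactions in
order and stops at the first failure, so a later failing xid only becomes
visible once the earlier one has been skipped. apply_stream and the xid
values are hypothetical; the real apply worker resumes from the
replication origin rather than replaying from the start as this sketch
does.

```python
def apply_stream(transactions, failing_xids, skip_xid=None):
    """Apply xids in order; return (applied_xids, first_failed_xid or None)."""
    applied = []
    for xid in transactions:
        if xid == skip_xid:
            continue              # the user asked to skip this transaction
        if xid in failing_xids:
            return applied, xid   # the worker stops (and would retry) here
        applied.append(xid)
    return applied, None


stream = [588, 590, 591, 595]
failing = {590, 595}

# Without a skip, only xid 590 is ever reported.
print(apply_stream(stream, failing))
# After skipping 590, the second failure (595) shows up for the first time.
print(apply_stream(stream, failing, skip_xid=590))
```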

>  But this syntax gives you flexibility, so we can also
> start with a simple implementation.
>

Yeah, I also think so. BTW, what do you think of providing extra
flexibility of giving other options like 'operation', 'rel' along with
xid? I think such options could be useful for large transactions that
operate on multiple tables as it is quite possible that only a
particular operation from the entire transaction is the cause of
failure. Now, on one side, we can argue that skipping the entire
transaction is better from the consistency point of view but I think
it is already possible that we just skip a particular update/delete
(if the corresponding tuple doesn't exist on the subscriber). For the
sake of simplicity, we can just allow providing xid at this stage and
then extend it later as required but I am not very sure of that point.

> > I am not sure what exactly we can provide to users to allow skipping
> > initial table sync as we can't specify XID there. One option that
> > comes to mind is to allow specifying a combination of copy_data and
> > relid to skip table sync for a particular relation. We might think of
> > not doing anything for table sync workers but not sure if that is a
> > good option.
>
> I don't think this feature should affect tablesync.  The semantics are
> not clear, and it's not really needed.  If the tablesync doesn't work,
> you can try the setup again from scratch.
>

Okay, that makes sense. But note it is possible that tablesync workers
might also need to skip some xids during the catchup phase to complete
the sync.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

03 December 2021, 09:41:47

On Fri, Dec 3, 2021 at 11:53 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Dec 2, 2021 at 8:38 PM Peter Eisentraut
> <peter.eisentraut@enterprisedb.com> wrote:
> >
> > On 02.12.21 07:48, Amit Kapila wrote:
> > > a. ALTER SUBSCRIPTION ... [SET|RESET] SKIP TRANSACTION xxx;
> > > b. Alter Subscription <sub_name> SET ( subscription_parameter [=value]
> > > [, ... ] );
> > > c. Alter Subscription <sub_name> On Error ( subscription_parameter
> > > [=value] [, ... ] );
> > > d. Alter Subscription <sub_name> SKIP ( subscription_parameter
> > > [=value] [, ... ] );
> > > where subscription_parameter can be one of:
> > > xid = <xid_val>
> > > lsn = <lsn_val>
> > > ...
> >
> > > As per discussion till now, option (d) seems preferable.
> >
> > I agree.

+1

> >
> > > In this, we
> > > need to see how and what to allow as options. The simplest way for the
> > > first version is to just allow one xid to be specified at a time which
> > > would mean that specifying multiple xids should error out. We can also
> > > additionally allow specifying operations like 'insert', 'update',
> > > etc., and then relation list (list of oids). What that would mean is
> > > that for a transaction we can allow which particular operations and
> > > relations we want to skip.
> >
> > I don't know how difficult it would be, but allowing multiple xids might
> > be desirable.
> >
>
> Are there many cases where there could be multiple xid failures that
> the user can skip? Apply worker always keeps looping at the same error
> failure so the user wouldn't know of the second xid failure (if any)
> till the first failure is resolved. I could think of one such case
> where it is possible during the initial synchronization phase where
> apply worker went ahead then tablesync worker by skipping to apply the
> changes on the corresponding table. After that, it is possible, that
> the table sync worker failed during the catch-up phase and apply
> worker fails during the processing of some other rel.
>
> >  But this syntax gives you flexibility, so we can also
> > start with a simple implementation.
> >
>
> Yeah, I also think so. BTW, what do you think of providing extra
> flexibility of giving other options like 'operation', 'rel' along with
> xid? I think such options could be useful for large transactions that
> operate on multiple tables as it is quite possible that only a
> particular operation from the entire transaction is the cause of
> failure. Now, on one side, we can argue that skipping the entire
> transaction is better from the consistency point of view but I think
> it is already possible that we just skip a particular update/delete
> (if the corresponding tuple doesn't exist on the subscriber). For the
> sake of simplicity, we can just allow providing xid at this stage and
> then extend it later as required but I am not very sure of that point.

+1

Skipping a whole transaction by specifying xid would be a good start.
Ideally, we'd like to automatically skip only operations within the
transaction that fail but it seems not easy to achieve. If we allow
specifying operations and/or relations, probably multiple operations
or relations need to be specified in some cases. Otherwise, the
subscriber cannot continue logical replication if the transaction has
multiple operations on different relations that fail. But similar to
the idea of specifying multiple xids, we need to note the fact that
user wouldn't know of the second operation failure unless the apply
worker applies the change. So I'm not sure there are many use cases in
practice where users can specify multiple operations and relations in
order to skip applies that fail.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

06 December 2021, 08:17:00

On Fri, Dec 3, 2021 at 12:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Dec 3, 2021 at 11:53 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > >  But this syntax gives you flexibility, so we can also
> > > start with a simple implementation.
> > >
> >
> > Yeah, I also think so. BTW, what do you think of providing extra
> > flexibility of giving other options like 'operation', 'rel' along with
> > xid? I think such options could be useful for large transactions that
> > operate on multiple tables as it is quite possible that only a
> > particular operation from the entire transaction is the cause of
> > failure. Now, on one side, we can argue that skipping the entire
> > transaction is better from the consistency point of view but I think
> > it is already possible that we just skip a particular update/delete
> > (if the corresponding tuple doesn't exist on the subscriber). For the
> > sake of simplicity, we can just allow providing xid at this stage and
> > then extend it later as required but I am not very sure of that point.
>
> +1
>
> Skipping a whole transaction by specifying xid would be a good start.
>

Okay, that sounds reasonable, so let's do that for now.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

07 December 2021, 14:36:10

On Mon, Dec 6, 2021 at 2:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Dec 3, 2021 at 12:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Dec 3, 2021 at 11:53 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > >  But this syntax gives you flexibility, so we can also
> > > > start with a simple implementation.
> > > >
> > >
> > > Yeah, I also think so. BTW, what do you think of providing extra
> > > flexibility of giving other options like 'operation', 'rel' along with
> > > xid? I think such options could be useful for large transactions that
> > > operate on multiple tables as it is quite possible that only a
> > > particular operation from the entire transaction is the cause of
> > > failure. Now, on one side, we can argue that skipping the entire
> > > transaction is better from the consistency point of view but I think
> > > it is already possible that we just skip a particular update/delete
> > > (if the corresponding tuple doesn't exist on the subscriber). For the
> > > sake of simplicity, we can just allow providing xid at this stage and
> > > then extend it later as required but I am not very sure of that point.
> >
> > +1
> >
> > Skipping a whole transaction by specifying xid would be a good start.
> >
>
> Okay, that sounds reasonable, so let's do that for now.

I'll submit the patch tomorrow.

While updating the patch, I realized that skipping a transaction that
is prepared on the publisher will be a bit tricky.

First of all, since skip-xid is in pg_subscription catalog, we need to
do a catalog update in a transaction and commit it to disable it. I
think we need to set origin-lsn and timestamp of the transaction being
skipped to the transaction that does the catalog update. That is,
during skipping the (not prepared) transaction, we skip all
data-modification changes coming from the publisher, do a catalog
update, and commit the transaction. If we do the catalog update in the
next transaction after skipping the whole transaction, skip_xid could
be left in case of a server crash between them. Also, we cannot set
origin-lsn and timestamp to an empty transaction.
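
The crash window mentioned above can be modeled roughly as follows. This
is a Python sketch under simplifying assumptions: the subscriber's durable
state is reduced to a dict holding origin_lsn and skip_xid, and each
returned dict represents what has been committed. The function names and
LSN values are hypothetical.

```python
def skip_in_one_commit(state, commit_lsn):
    """Advance the origin and clear skip_xid atomically, in one commit."""
    return dict(state, origin_lsn=commit_lsn, skip_xid=None)


def skip_in_two_commits(state, commit_lsn, crash_between=False):
    """Advance the origin in one commit and clear skip_xid in the next."""
    state = dict(state, origin_lsn=commit_lsn)   # first commit is durable
    if crash_between:
        return state                             # second commit never happens
    return dict(state, skip_xid=None)


before = {"origin_lsn": 100, "skip_xid": 590}

print(skip_in_one_commit(before, 200))
# After a crash between the two commits the origin has moved past the
# skipped transaction, yet skip_xid is still set.
print(skip_in_two_commits(before, 200, crash_between=True))
```

In the second case the publisher would not resend the skipped transaction,
so nothing would clear the leftover skip_xid in this model.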

In prepared transaction cases, I think that when handling a prepare
message, we need to commit the transaction to update the catalog,
instead of preparing it. And at commit prepared and rollback
prepared time, we skip it since there is no prepared transaction
on the subscriber. Currently, handling rollback prepared already
behaves so; it first checks whether we have prepared the transaction
or not and skips it if we haven’t. So I think we need to do that also for
the commit prepared case. This requires protocol changes so
that the subscriber can get prepare-lsn and prepare-time when handling
commit prepared.
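
A small Python model of that flow, for illustration only: on PREPARE of
the transaction to be skipped the subscriber commits just the catalog
update, and a later COMMIT PREPARED (or ROLLBACK PREPARED) for a GID it
never prepared is ignored. The Subscriber class, its log strings, and the
GID values are hypothetical, and the protocol details (prepare-lsn,
prepare-time) are left out.

```python
class Subscriber:
    """Toy subscriber tracking skip_xid and locally prepared transactions."""

    def __init__(self, skip_xid=None):
        self.skip_xid = skip_xid
        self.prepared = set()
        self.log = []

    def handle_prepare(self, xid, gid):
        if xid == self.skip_xid:
            # Changes were skipped; commit only the catalog update.
            self.skip_xid = None
            self.log.append(f"commit (skip_xid cleared) instead of prepare {gid}")
        else:
            self.prepared.add(gid)
            self.log.append(f"prepare {gid}")

    def handle_commit_prepared(self, gid):
        self._finish(gid, "commit prepared")

    def handle_rollback_prepared(self, gid):
        self._finish(gid, "rollback prepared")

    def _finish(self, gid, action):
        if gid not in self.prepared:
            self.log.append(f"ignore {action} {gid}")
            return
        self.prepared.remove(gid)
        self.log.append(f"{action} {gid}")


sub = Subscriber(skip_xid=590)
sub.handle_prepare(590, "gid_590")
sub.handle_commit_prepared("gid_590")
sub.handle_prepare(591, "gid_591")
sub.handle_commit_prepared("gid_591")
print("\n".join(sub.log))
```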

So I’m writing a separate patch to add prepare-lsn and timestamp to
commit_prepared message, which will be a building block for skipping
prepared transactions. Actually, I think it’s beneficial even today;
we can skip preparing the transaction if it’s an empty transaction.
Although the comment says it’s not a common case, I think that it could
happen quite often in some cases:

    * XXX, We can optimize such that at commit prepared time, we first check
    * whether we have prepared the transaction or not but that doesn't seem
    * worthwhile because such cases shouldn't be common.
    */

For example, if the publisher has multiple subscriptions and there are
many prepared transactions that modify the particular table subscribed
by one publisher, many empty transactions are replicated to other
subscribers.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Peter Eisentraut

Date:

07 December 2021, 17:44:18

On 03.12.21 03:53, Amit Kapila wrote:
>> I don't know how difficult it would be, but allowing multiple xids might
>> be desirable.
> 
> Are there many cases where there could be multiple xid failures that
> the user can skip? Apply worker always keeps looping at the same error
> failure so the user wouldn't know of the second xid failure (if any)
> till the first failure is resolved.

Yeah, never mind, that doesn't make sense.

> Yeah, I also think so. BTW, what do you think of providing extra
> flexibility of giving other options like 'operation', 'rel' along with
> xid? I think such options could be useful for large transactions that
> operate on multiple tables as it is quite possible that only a
> particular operation from the entire transaction is the cause of
> failure. Now, on one side, we can argue that skipping the entire
> transaction is better from the consistency point of view but I think
> it is already possible that we just skip a particular update/delete
> (if the corresponding tuple doesn't exist on the subscriber). For the
> sake of simplicity, we can just allow providing xid at this stage and
> then extend it later as required but I am not very sure of that point.

Skipping transactions partially sounds dangerous, especially when 
exposed as an option to users.  Needs more careful thought.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

08 December 2021, 08:15:48

On Tue, Dec 7, 2021 at 5:06 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Dec 6, 2021 at 2:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I'll submit the patch tomorrow.
>
> While updating the patch, I realized that skipping a transaction that
> is prepared on the publisher will be tricky a bit;
>
> First of all, since skip-xid is in pg_subscription catalog, we need to
> do a catalog update in a transaction and commit it to disable it. I
> think we need to set origin-lsn and timestamp of the transaction being
> skipped to the transaction that does the catalog update. That is,
> during skipping the (not prepared) transaction, we skip all
> data-modification changes coming from the publisher, do a catalog
> update, and commit the transaction. If we do the catalog update in the
> next transaction after skipping the whole transaction, skip_xid could
> be left in case of a server crash between them.
>

But if we haven't updated origin_lsn/timestamp before the crash, won't
it request the same transaction again from the publisher? If so, it
will be again able to skip it because skip_xid is still not updated.

> Also, we cannot set
> origin-lsn and timestamp to an empty transaction.
>

But won't we update the catalog for skip_xid in that case?

Do we see any advantage of updating the skip_xid in the same
transaction vs. doing it in a separate transaction? If not then
probably we can choose either of those ways and add some comments to
indicate the possibility of doing it another way.

> In prepared transaction cases, I think that when handling a prepare
> message, we need to commit the transaction to update the catalog,
> instead of preparing it. And at the commit prepared and rollback
> prepared time, we skip it since there is not the prepared transaction
> on the subscriber.
>

Can't we think of just allowing prepare in this case and updating the
skip_xid only at commit time? I see that in this case, we would be
doing prepare for a transaction that has no changes but as such cases
won't be common, isn't that acceptable?

> Currently, handling rollback prepared already
> behaves so; it first checks whether we have prepared the transaction
> or not and skip it if haven’t. So I think we need to do that also for
> commit prepared case. With that, this requires protocol changes so
> that the subscriber can get prepare-lsn and prepare-time when handling
> commit prepared.
>
> So I’m writing a separate patch to add prepare-lsn and timestamp to
> commit_prepared message, which will be a building block for skipping
> prepared transactions. Actually, I think it’s beneficial even today;
> we can skip preparing the transaction if it’s an empty transaction.
> Although the comment it’s not a common case, I think that it could
> happen quite often in some cases:
>
>     * XXX, We can optimize such that at commit prepared time, we first check
>     * whether we have prepared the transaction or not but that doesn't seem
>     * worthwhile because such cases shouldn't be common.
>     */
>
> For example, if the publisher has multiple subscriptions and there are
> many prepared transactions that modify the particular table subscribed
> by one publisher, many empty transactions are replicated to other
> subscribers.
>

This is not clear to me. Why would one have multiple
subscriptions for the same publication? I thought this would be possible
when, say, a publisher doesn't publish any data of a prepared transaction
because the corresponding action is not published or something
like that. I don't deny that someday we may want to optimize this case,
but it might be better if we don't need to do it along with this patch.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

08 December 2021, 09:17:52

On Wed, Dec 8, 2021 at 2:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Dec 7, 2021 at 5:06 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Dec 6, 2021 at 2:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > I'll submit the patch tomorrow.
> >
> > While updating the patch, I realized that skipping a transaction that
> > is prepared on the publisher will be tricky a bit;
> >
> > First of all, since skip-xid is in pg_subscription catalog, we need to
> > do a catalog update in a transaction and commit it to disable it. I
> > think we need to set origin-lsn and timestamp of the transaction being
> > skipped to the transaction that does the catalog update. That is,
> > during skipping the (not prepared) transaction, we skip all
> > data-modification changes coming from the publisher, do a catalog
> > update, and commit the transaction. If we do the catalog update in the
> > next transaction after skipping the whole transaction, skip_xid could
> > be left in case of a server crash between them.
> >
>
> But if we haven't updated origin_lsn/timestamp before the crash, won't
> it request the same transaction again from the publisher? If so, it
> will be again able to skip it because skip_xid is still not updated.

Yes. I mean that if we update origin_lsn and origin_timestamp when
committing the skipped transaction and then update the catalog in the
next transaction it doesn't work in case of a crash. But it's not
possible in the first place since the first transaction is empty and
we cannot set origin_lsn and origin_timestamp to it.

>
> > Also, we cannot set
> > origin-lsn and timestamp to an empty transaction.
> >
>
> But won't we update the catalog for skip_xid in that case?

Yes. Probably my explanation was not clear. Even if we skip all
changes of the transaction, the transaction doesn't become empty since
we update the catalog.

>
> Do we see any advantage of updating the skip_xid in the same
> transaction vs. doing it in a separate transaction? If not then
> probably we can choose either of those ways and add some comments to
> indicate the possibility of doing it another way.

I think that since the skipped transaction is always empty there is
always one transaction. What we need to consider is when we update
origin_lsn and origin_timestamp. In non-prepared transaction cases,
the only option is when updating the catalog.
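
To illustrate the non-prepared case, here is a toy Python model (not
PostgreSQL code; the dictionary keys and the function name are made up
for illustration). It assumes the catalog update and the origin
position become durable in one commit, so a crash before that commit
leaves both untouched and the transaction is simply requested again:

```python
def apply_remote_txn(committed, xid, n_changes, commit_lsn,
                     crash_before_commit=False):
    """Apply or skip one remote transaction; return the durable state."""
    txn = dict(committed)  # private working copy; nothing is durable yet
    if txn["skip_xid"] == xid:
        # Skip every data change. Clearing skip_xid is the only write,
        # which is why the local transaction is not empty.
        txn["skip_xid"] = None
    else:
        txn["rows_applied"] += n_changes
    # The origin position is recorded by the same commit as the catalog update.
    txn["origin_lsn"] = commit_lsn
    # A crash before commit discards the working copy entirely.
    return committed if crash_before_commit else txn


state = {"skip_xid": 590, "origin_lsn": 0, "rows_applied": 0}

# First attempt crashes before commit, second attempt succeeds.
after_crash = apply_remote_txn(state, 590, 3, 100, crash_before_commit=True)
final = apply_remote_txn(after_crash, 590, 3, 100)
print(after_crash)
print(final)
```

After the simulated crash skip_xid is still set, so the replayed
transaction is skipped again rather than applied.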

>
> > In prepared transaction cases, I think that when handling a prepare
> > message, we need to commit the transaction to update the catalog,
> > instead of preparing it. And at the commit prepared and rollback
> > prepared time, we skip it since there is not the prepared transaction
> > on the subscriber.
> >
>
> Can't we think of just allowing prepare in this case and updating the
> skip_xid only at commit time? I see that in this case, we would be
> doing prepare for a transaction that has no changes but as such cases
> won't be common, isn't that acceptable?

In this case, we will end up committing both the prepared (empty)
transaction and the transaction that updates the catalog, right? If
so, since these are separate transactions it can be a problem in case
of a crash between these two commits.

>
> > Currently, handling rollback prepared already
> > behaves so; it first checks whether we have prepared the transaction
> > or not and skip it if haven’t. So I think we need to do that also for
> > commit prepared case. With that, this requires protocol changes so
> > that the subscriber can get prepare-lsn and prepare-time when handling
> > commit prepared.
> >
> > So I’m writing a separate patch to add prepare-lsn and timestamp to
> > commit_prepared message, which will be a building block for skipping
> > prepared transactions. Actually, I think it’s beneficial even today;
> > we can skip preparing the transaction if it’s an empty transaction.
> > Although the comment it’s not a common case, I think that it could
> > happen quite often in some cases:
> >
> >     * XXX, We can optimize such that at commit prepared time, we first check
> >     * whether we have prepared the transaction or not but that doesn't seem
> >     * worthwhile because such cases shouldn't be common.
> >     */
> >
> > For example, if the publisher has multiple subscriptions and there are
> > many prepared transactions that modify the particular table subscribed
> > by one publisher, many empty transactions are replicated to other
> > subscribers.
> >
>
> I think this is not clear to me. Why would one have multiple
> subscriptions for the same publication? I thought it is possible when
> say some publisher doesn't publish any data of prepared transaction
> say because the corresponding action is not published or something
> like that. I don't deny that someday we want to optimize this case but
> it might be better if we don't need to do it along with this patch.

I imagined that the publisher has two publications (say pub-A and
pub-B) that publish different sets of relations in the database, and
there are two subscribers, each subscribing to one of them (e.g.,
subscriber-A subscribes to pub-A and subscriber-B subscribes to
pub-B). If many prepared transactions happen on the publisher and
these transactions modify only relations published by pub-A, both
subscriber-A and subscriber-B would prepare the same number of
transactions, but all of them on subscriber-B are empty.
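
A small Python sketch of that scenario (purely illustrative; the
relation names t1..t3 are hypothetical and this is not how the
publisher filters changes internally). It counts how many prepared
transactions carry no change for each publication:

```python
# Relations covered by each publication (hypothetical names).
publications = {
    "pub_a": {"t1", "t2"},
    "pub_b": {"t3"},
}

# Each prepared transaction is modeled as the set of relations it modifies.
prepared_txns = [{"t1"}, {"t1", "t2"}, {"t2"}]


def empty_prepares(published, txns):
    """Count transactions that touch none of the published relations."""
    return sum(1 for rels in txns if not (rels & published))


empty_for_a = empty_prepares(publications["pub_a"], prepared_txns)
empty_for_b = empty_prepares(publications["pub_b"], prepared_txns)
print(empty_for_a, empty_for_b)
```

In this model every transaction touches only pub_a relations, so all
three arrive empty at the subscriber of pub_b.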

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

8 December 2021, 09:50:47

On Wed, Dec 8, 2021 at 11:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Dec 8, 2021 at 2:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Dec 7, 2021 at 5:06 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Mon, Dec 6, 2021 at 2:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > I'll submit the patch tomorrow.
> > >
> > > While updating the patch, I realized that skipping a transaction that
> > > is prepared on the publisher will be tricky a bit;
> > >
> > > First of all, since skip-xid is in pg_subscription catalog, we need to
> > > do a catalog update in a transaction and commit it to disable it. I
> > > think we need to set origin-lsn and timestamp of the transaction being
> > > skipped to the transaction that does the catalog update. That is,
> > > during skipping the (not prepared) transaction, we skip all
> > > data-modification changes coming from the publisher, do a catalog
> > > update, and commit the transaction. If we do the catalog update in the
> > > next transaction after skipping the whole transaction, skip_xid could
> > > be left in case of a server crash between them.
> > >
> >
> > But if we haven't updated origin_lsn/timestamp before the crash, won't
> > it request the same transaction again from the publisher? If so, it
> > will be again able to skip it because skip_xid is still not updated.
>
> Yes. I mean that if we update origin_lsn and origin_timestamp when
> committing the skipped transaction and then update the catalog in the
> next transaction it doesn't work in case of a crash. But it's not
> possible in the first place since the first transaction is empty and
> we cannot set origin_lsn and origin_timestamp to it.
>
> >
> > > Also, we cannot set
> > > origin-lsn and timestamp to an empty transaction.
> > >
> >
> > But won't we update the catalog for skip_xid in that case?
>
> Yes. Probably my explanation was not clear. Even if we skip all
> changes of the transaction, the transaction doesn't become empty since
> we update the catalog.
>
> >
> > Do we see any advantage of updating the skip_xid in the same
> > transaction vs. doing it in a separate transaction? If not then
> > probably we can choose either of those ways and add some comments to
> > indicate the possibility of doing it another way.
>
> I think that since the skipped transaction is always empty there is
> always one transaction. What we need to consider is when we update
> origin_lsn and origin_timestamp. In non-prepared transaction cases,
> the only option is when updating the catalog.
>

Your last sentence is not completely clear to me but it seems you
agree that we can use one transaction instead of two to skip the
changes, perform a catalog update, and update origin_lsn/timestamp.

> >
> > > In prepared transaction cases, I think that when handling a prepare
> > > message, we need to commit the transaction to update the catalog,
> > > instead of preparing it. And at the commit prepared and rollback
> > > prepared time, we skip it since there is not the prepared transaction
> > > on the subscriber.
> > >
> >
> > Can't we think of just allowing prepare in this case and updating the
> > skip_xid only at commit time? I see that in this case, we would be
> > doing prepare for a transaction that has no changes but as such cases
> > won't be common, isn't that acceptable?
>
> In this case, we will end up committing both the prepared (empty)
> transaction and the transaction that updates the catalog, right?
>

Can't we do this catalog update before committing the prepared
transaction? If so, both in prepared and non-prepared cases, our
implementation could be the same and we have a reason to accomplish
the catalog update in the same transaction for which we skipped the
changes.

> If
> so, since these are separate transactions it can be a problem in case
> of a crash between these two commits.
>
> >
> > > Currently, handling rollback prepared already
> > > behaves so; it first checks whether we have prepared the transaction
> > > or not and skip it if haven’t. So I think we need to do that also for
> > > commit prepared case. With that, this requires protocol changes so
> > > that the subscriber can get prepare-lsn and prepare-time when handling
> > > commit prepared.
> > >
> > > So I’m writing a separate patch to add prepare-lsn and timestamp to
> > > commit_prepared message, which will be a building block for skipping
> > > prepared transactions. Actually, I think it’s beneficial even today;
> > > we can skip preparing the transaction if it’s an empty transaction.
> > > Although the comment it’s not a common case, I think that it could
> > > happen quite often in some cases:
> > >
> > >     * XXX, We can optimize such that at commit prepared time, we first check
> > >     * whether we have prepared the transaction or not but that doesn't seem
> > >     * worthwhile because such cases shouldn't be common.
> > >     */
> > >
> > > For example, if the publisher has multiple subscriptions and there are
> > > many prepared transactions that modify the particular table subscribed
> > > by one publisher, many empty transactions are replicated to other
> > > subscribers.
> > >
> >
> > I think this is not clear to me. Why would one have multiple
> > subscriptions for the same publication? I thought it is possible when
> > say some publisher doesn't publish any data of prepared transaction
> > say because the corresponding action is not published or something
> > like that. I don't deny that someday we want to optimize this case but
> > it might be better if we don't need to do it along with this patch.
>
> I imagined that the publisher has two publications (say pub-A and
> pub-B) that publish a different set of relations in the database and
> there are two subscribers that are subscribing to either one
> publication (e.g, subscriber-A subscribes to pub-A and subscriber-B
> subscribes to pub-B). If many prepared transactions happen on the
> publisher and these transactions modify only relations published by
> pub-A,  both subscriber-A and subscriber-B would prepare the same
> number of transactions but all of them in subscriber-B is empty.
>

Okay, I understand those cases, but note that always checking whether
the prepared xact exists during commit prepared has a cost, and that
is why we avoided it in the first place. There is a separate effort in
progress [1] where we want to avoid sending empty transactions in the
first place. So, it is better to avoid this cost via that effort
rather than adding additional cost at the commit of each prepared
transaction. OTOH, if there are other strong reasons to do it then we
can probably consider it.

[1] - https://commitfest.postgresql.org/36/3093/

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

8 December 2021, 10:05:49

On Wed, Dec 8, 2021 at 3:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Dec 8, 2021 at 11:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Dec 8, 2021 at 2:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Dec 7, 2021 at 5:06 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Mon, Dec 6, 2021 at 2:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > I'll submit the patch tomorrow.
> > > >
> > > > While updating the patch, I realized that skipping a transaction that
> > > > is prepared on the publisher will be tricky a bit;
> > > >
> > > > First of all, since skip-xid is in pg_subscription catalog, we need to
> > > > do a catalog update in a transaction and commit it to disable it. I
> > > > think we need to set origin-lsn and timestamp of the transaction being
> > > > skipped to the transaction that does the catalog update. That is,
> > > > during skipping the (not prepared) transaction, we skip all
> > > > data-modification changes coming from the publisher, do a catalog
> > > > update, and commit the transaction. If we do the catalog update in the
> > > > next transaction after skipping the whole transaction, skip_xid could
> > > > be left in case of a server crash between them.
> > > >
> > >
> > > But if we haven't updated origin_lsn/timestamp before the crash, won't
> > > it request the same transaction again from the publisher? If so, it
> > > will be again able to skip it because skip_xid is still not updated.
> >
> > Yes. I mean that if we update origin_lsn and origin_timestamp when
> > committing the skipped transaction and then update the catalog in the
> > next transaction it doesn't work in case of a crash. But it's not
> > possible in the first place since the first transaction is empty and
> > we cannot set origin_lsn and origin_timestamp to it.
> >
> > >
> > > > Also, we cannot set
> > > > origin-lsn and timestamp to an empty transaction.
> > > >
> > >
> > > But won't we update the catalog for skip_xid in that case?
> >
> > Yes. Probably my explanation was not clear. Even if we skip all
> > changes of the transaction, the transaction doesn't become empty since
> > we update the catalog.
> >
> > >
> > > Do we see any advantage of updating the skip_xid in the same
> > > transaction vs. doing it in a separate transaction? If not then
> > > probably we can choose either of those ways and add some comments to
> > > indicate the possibility of doing it another way.
> >
> > I think that since the skipped transaction is always empty there is
> > always one transaction. What we need to consider is when we update
> > origin_lsn and origin_timestamp. In non-prepared transaction cases,
> > the only option is when updating the catalog.
> >
>
> Your last sentence is not completely clear to me but it seems you
> agree that we can use one transaction instead of two to skip the
> changes, perform a catalog update, and update origin_lsn/timestamp.

Yes.

>
> > >
> > > > In prepared transaction cases, I think that when handling a prepare
> > > > message, we need to commit the transaction to update the catalog,
> > > > instead of preparing it. And at the commit prepared and rollback
> > > > prepared time, we skip it since there is not the prepared transaction
> > > > on the subscriber.
> > > >
> > >
> > > Can't we think of just allowing prepare in this case and updating the
> > > skip_xid only at commit time? I see that in this case, we would be
> > > doing prepare for a transaction that has no changes but as such cases
> > > won't be common, isn't that acceptable?
> >
> > In this case, we will end up committing both the prepared (empty)
> > transaction and the transaction that updates the catalog, right?
> >
>
> Can't we do this catalog update before committing the prepared
> transaction? If so, both in prepared and non-prepared cases, our
> implementation could be the same and we have a reason to accomplish
> the catalog update in the same transaction for which we skipped the
> changes.

But in case of a crash between these two transactions, given that
skip_xid is already cleared how do we know the prepared transaction
that was supposed to be skipped?

>
> > If
> > so, since these are separate transactions it can be a problem in case
> > of a crash between these two commits.
> >
> > >
> > > > Currently, handling rollback prepared already
> > > > behaves so; it first checks whether we have prepared the transaction
> > > > or not and skip it if haven’t. So I think we need to do that also for
> > > > commit prepared case. With that, this requires protocol changes so
> > > > that the subscriber can get prepare-lsn and prepare-time when handling
> > > > commit prepared.
> > > >
> > > > So I’m writing a separate patch to add prepare-lsn and timestamp to
> > > > commit_prepared message, which will be a building block for skipping
> > > > prepared transactions. Actually, I think it’s beneficial even today;
> > > > we can skip preparing the transaction if it’s an empty transaction.
> > > > Although the comment it’s not a common case, I think that it could
> > > > happen quite often in some cases:
> > > >
> > > >     * XXX, We can optimize such that at commit prepared time, we first check
> > > >     * whether we have prepared the transaction or not but that doesn't seem
> > > >     * worthwhile because such cases shouldn't be common.
> > > >     */
> > > >
> > > > For example, if the publisher has multiple subscriptions and there are
> > > > many prepared transactions that modify the particular table subscribed
> > > > by one publisher, many empty transactions are replicated to other
> > > > subscribers.
> > > >
> > >
> > > I think this is not clear to me. Why would one have multiple
> > > subscriptions for the same publication? I thought it is possible when
> > > say some publisher doesn't publish any data of prepared transaction
> > > say because the corresponding action is not published or something
> > > like that. I don't deny that someday we want to optimize this case but
> > > it might be better if we don't need to do it along with this patch.
> >
> > I imagined that the publisher has two publications (say pub-A and
> > pub-B) that publish a different set of relations in the database and
> > there are two subscribers that are subscribing to either one
> > publication (e.g, subscriber-A subscribes to pub-A and subscriber-B
> > subscribes to pub-B). If many prepared transactions happen on the
> > publisher and these transactions modify only relations published by
> > pub-A,  both subscriber-A and subscriber-B would prepare the same
> > number of transactions but all of them in subscriber-B is empty.
> >
>
> Okay, I understand those cases, but note that always checking whether
> the prepared xact exists during commit prepared has a cost, and that
> is why we avoided it in the first place. There is a separate effort in
> progress [1] where we want to avoid sending empty transactions in the
> first place. So, it is better to avoid this cost via that effort
> rather than adding additional cost at commit of each prepared
> transaction. OTOH, if there are other strong reasons to do it then we
> can probably consider it.
>

Thank you for the information. Agreed.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

8 December 2021, 11:54:37

On Wed, Dec 8, 2021 at 12:36 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Dec 8, 2021 at 3:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Dec 8, 2021 at 11:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > >
> > > > Can't we think of just allowing prepare in this case and updating the
> > > > skip_xid only at commit time? I see that in this case, we would be
> > > > doing prepare for a transaction that has no changes but as such cases
> > > > won't be common, isn't that acceptable?
> > >
> > > In this case, we will end up committing both the prepared (empty)
> > > transaction and the transaction that updates the catalog, right?
> > >
> >
> > Can't we do this catalog update before committing the prepared
> > transaction? If so, both in prepared and non-prepared cases, our
> > implementation could be the same and we have a reason to accomplish
> > the catalog update in the same transaction for which we skipped the
> > changes.
>
> But in case of a crash between these two transactions, given that
> skip_xid is already cleared how do we know the prepared transaction
> that was supposed to be skipped?
>

I was thinking of doing it as one transaction at the time of
commit_prepare. Say, in function apply_handle_commit_prepared(), we
check whether skip_xid is the same as prepare_data.xid and, if so,
update the catalog and set origin_lsn/timestamp in the same
transaction. Why do we need two transactions for it?

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

8 December 2021, 14:06:01

On Wed, Dec 8, 2021 at 5:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Dec 8, 2021 at 12:36 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Dec 8, 2021 at 3:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Dec 8, 2021 at 11:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > >
> > > > > Can't we think of just allowing prepare in this case and updating the
> > > > > skip_xid only at commit time? I see that in this case, we would be
> > > > > doing prepare for a transaction that has no changes but as such cases
> > > > > won't be common, isn't that acceptable?
> > > >
> > > > In this case, we will end up committing both the prepared (empty)
> > > > transaction and the transaction that updates the catalog, right?
> > > >
> > >
> > > Can't we do this catalog update before committing the prepared
> > > transaction? If so, both in prepared and non-prepared cases, our
> > > implementation could be the same and we have a reason to accomplish
> > > the catalog update in the same transaction for which we skipped the
> > > changes.
> >
> > But in case of a crash between these two transactions, given that
> > skip_xid is already cleared how do we know the prepared transaction
> > that was supposed to be skipped?
> >
>
> I was thinking of doing it as one transaction at the time of
> commit_prepare. Say, in function apply_handle_commit_prepared(), if we
> check whether the skip_xid is the same as prepare_data.xid then update
> the catalog and set origin_lsn/timestamp in the same transaction. Why
> do we need two transactions for it?

I meant the two transactions are the prepared transaction and the
transaction that updates the catalog. If I understand your idea
correctly, in apply_handle_commit_prepared(), we update the catalog
and set origin_lsn/timestamp. These are done in the same transaction.
Then, we commit the prepared transaction, right? If the server crashes
between them, skip_xid is already cleared and logical replication
starts from the LSN after COMMIT PREPARED. But the prepared
transaction still exists on the subscriber.
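
To make the failure concrete, here is a toy Python model of that
ordering (not PostgreSQL code; the state dictionary and function names
are made up for illustration). It assumes the publisher resends a
commit prepared only when the subscriber's origin position is still
behind it:

```python
def commit_prepared_origin_first(state, xid, commit_lsn, crash_between=False):
    """Clear skip_xid and advance the origin first, finish the 2PC after."""
    # Transaction 1: catalog update and origin advance commit together.
    if state["skip_xid"] == xid:
        state["skip_xid"] = None
    state["origin_lsn"] = commit_lsn
    if crash_between:
        return state  # simulated crash: the second step never runs
    # Step 2: commit the prepared transaction itself.
    state["prepared"].discard(xid)
    return state


def will_be_resent(state, commit_lsn):
    """The publisher resends only what lies beyond the recorded origin."""
    return state["origin_lsn"] < commit_lsn


state = {"skip_xid": 590, "origin_lsn": 0, "prepared": {590}}
commit_prepared_origin_first(state, 590, 100, crash_between=True)
resent = will_be_resent(state, 100)
print(resent, state["prepared"])
```

In this model the origin has already moved past COMMIT PREPARED, so
nothing is resent and xid 590 stays prepared on the subscriber.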

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

8 December 2021, 14:22:32

On Wed, Dec 8, 2021 at 4:05 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > Okay, I understand those cases, but note that always checking whether
> > the prepared xact exists during commit prepared has a cost, and that
> > is why we avoided it in the first place.

BTW what costs were we concerned about? Looking at LookupGXact(), we
look for the 2PC state data in shared memory while holding
TwoPhaseStateLock in shared mode, and we check origin_lsn and
origin_timestamp of the 2PC transaction by reading WAL or the 2PC
state file only if the gid matches. On the other hand, committing the
prepared transaction does WAL logging, waits for synchronous
replication, calls post-commit callbacks, removes the 2PC state file,
etc. It also requires acquiring TwoPhaseStateLock in exclusive mode to
remove the 2PC state entry. So it looks like always checking whether
the prepared transaction exists and skipping it if not is cheaper than
always committing prepared transactions.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

9 December 2021, 05:47:27

On Wed, Dec 8, 2021 at 4:36 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Dec 8, 2021 at 5:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Dec 8, 2021 at 12:36 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Wed, Dec 8, 2021 at 3:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Wed, Dec 8, 2021 at 11:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > >
> > > > > > Can't we think of just allowing prepare in this case and updating the
> > > > > > skip_xid only at commit time? I see that in this case, we would be
> > > > > > doing prepare for a transaction that has no changes but as such cases
> > > > > > won't be common, isn't that acceptable?
> > > > >
> > > > > In this case, we will end up committing both the prepared (empty)
> > > > > transaction and the transaction that updates the catalog, right?
> > > > >
> > > >
> > > > Can't we do this catalog update before committing the prepared
> > > > transaction? If so, both in prepared and non-prepared cases, our
> > > > implementation could be the same and we have a reason to accomplish
> > > > the catalog update in the same transaction for which we skipped the
> > > > changes.
> > >
> > > But in case of a crash between these two transactions, given that
> > > skip_xid is already cleared how do we know the prepared transaction
> > > that was supposed to be skipped?
> > >
> >
> > I was thinking of doing it as one transaction at the time of
> > commit_prepare. Say, in function apply_handle_commit_prepared(), if we
> > check whether the skip_xid is the same as prepare_data.xid then update
> > the catalog and set origin_lsn/timestamp in the same transaction. Why
> > do we need two transactions for it?
>
> I meant the two transactions are the prepared transaction and the
> transaction that updates the catalog. If I understand your idea
> correctly, in apply_handle_commit_prepared(), we update the catalog
> and set origin_lsn/timestamp. These are done in the same transaction.
> Then, we commit the prepared transaction, right?
>

I am thinking that we can start a transaction, update the catalog, and
commit that transaction. Then we start a new one to update
origin_lsn/timestamp, finish the prepared transaction, and commit it.
Now, if it crashes after the first transaction, only commit prepared
will be resent, and this time we don't need to update the catalog as
that entry would already be cleared.
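
The ordering can be sketched with a toy Python model (not PostgreSQL
code; the state dictionary and function name are made up for
illustration). It assumes a commit prepared is resent whenever the
subscriber's origin position is still behind it:

```python
def commit_prepared_catalog_first(state, xid, commit_lsn, crash_between=False):
    """Clear skip_xid first; finish the 2PC and advance the origin together."""
    # Transaction 1: only the catalog update, then commit.
    if state["skip_xid"] == xid:
        state["skip_xid"] = None
    if crash_between:
        return state  # simulated crash: transaction 2 never runs
    # Transaction 2: finish the prepared transaction and advance the origin.
    state["prepared"].discard(xid)
    state["origin_lsn"] = commit_lsn
    return state


state = {"skip_xid": 590, "origin_lsn": 0, "prepared": {590}}
commit_prepared_catalog_first(state, 590, 100, crash_between=True)

# The origin did not advance, so commit prepared is sent again.
resend_needed = state["origin_lsn"] < 100
if resend_needed:
    commit_prepared_catalog_first(state, 590, 100)
print(resend_needed, state)
```

On the second attempt skip_xid is already cleared, so the handler just
finishes the (empty) prepared transaction and advances the origin.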

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

9 December 2021, 11:53:49

On Thu, Dec 9, 2021 at 11:47 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Dec 8, 2021 at 4:36 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Dec 8, 2021 at 5:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Dec 8, 2021 at 12:36 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Wed, Dec 8, 2021 at 3:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Wed, Dec 8, 2021 at 11:48 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > > >
> > > > > > > Can't we think of just allowing prepare in this case and updating the
> > > > > > > skip_xid only at commit time? I see that in this case, we would be
> > > > > > > doing prepare for a transaction that has no changes but as such cases
> > > > > > > won't be common, isn't that acceptable?
> > > > > >
> > > > > > In this case, we will end up committing both the prepared (empty)
> > > > > > transaction and the transaction that updates the catalog, right?
> > > > > >
> > > > >
> > > > > Can't we do this catalog update before committing the prepared
> > > > > transaction? If so, both in prepared and non-prepared cases, our
> > > > > implementation could be the same and we have a reason to accomplish
> > > > > the catalog update in the same transaction for which we skipped the
> > > > > changes.
> > > >
> > > > But in case of a crash between these two transactions, given that
> > > > skip_xid is already cleared how do we know the prepared transaction
> > > > that was supposed to be skipped?
> > > >
> > >
> > > I was thinking of doing it as one transaction at the time of
> > > commit_prepare. Say, in function apply_handle_commit_prepared(), if we
> > > check whether the skip_xid is the same as prepare_data.xid then update
> > > the catalog and set origin_lsn/timestamp in the same transaction. Why
> > > do we need two transactions for it?
> >
> > I meant the two transactions are the prepared transaction and the
> > transaction that updates the catalog. If I understand your idea
> > correctly, in apply_handle_commit_prepared(), we update the catalog
> > and set origin_lsn/timestamp. These are done in the same transaction.
> > Then, we commit the prepared transaction, right?
> >
>
> I am thinking that we can start a transaction, update the catalog,
> commit that transaction. Then start a new one to update
> origin_lsn/timestamp, finishprepared, and commit it. Now, if it
> crashes after the first transaction, only commit prepared will be
> resent again and this time we don't need to update the catalog as that
> entry would be already cleared.

Sounds good. In the crash case, it should be fine since we will just
commit an empty transaction. The same is true for the case where
skip_xid has been changed after skipping and preparing the transaction
and before handling commit_prepared.

Regarding the case where the user specifies the XID of the transaction
after it is prepared on the subscriber (i.e., the transaction is not
empty), we won’t skip committing the prepared transaction. But I think
that we don't need to support skipping an already-prepared transaction
since such a transaction doesn't conflict with anything regardless of
whether it has changes or not.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

9 December 2021, 12:16:13

On Thu, Dec 9, 2021 at 2:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Dec 9, 2021 at 11:47 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > I am thinking that we can start a transaction, update the catalog,
> > commit that transaction. Then start a new one to update
> > origin_lsn/timestamp, finishprepared, and commit it. Now, if it
> > crashes after the first transaction, only commit prepared will be
> > resent again and this time we don't need to update the catalog as that
> > entry would be already cleared.
>
> Sounds good. In the crash case, it should be fine since we will just
> commit an empty transaction. The same is true for the case where
> skip_xid has been changed after skipping and preparing the transaction
> and before handling commit_prepared.
>
> Regarding the case where the user specifies XID of the transaction
> after it is prepared on the subscriber (i.e., the transaction is not
> empty), we won’t skip committing the prepared transaction. But I think
> that we don't need to support skipping already-prepared transaction
> since such transaction doesn't conflict with anything regardless of
> having changed or not.
>

Yeah, this makes sense to me.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

10 December 2021, 08:44:16

On Thu, Dec 9, 2021 at 6:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Dec 9, 2021 at 2:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Dec 9, 2021 at 11:47 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > I am thinking that we can start a transaction, update the catalog,
> > > commit that transaction. Then start a new one to update
> > > origin_lsn/timestamp, finishprepared, and commit it. Now, if it
> > > crashes after the first transaction, only commit prepared will be
> > > resent again and this time we don't need to update the catalog as that
> > > entry would be already cleared.
> >
> > Sounds good. In the crash case, it should be fine since we will just
> > commit an empty transaction. The same is true for the case where
> > skip_xid has been changed after skipping and preparing the transaction
> > and before handling commit_prepared.
> >
> > Regarding the case where the user specifies XID of the transaction
> > after it is prepared on the subscriber (i.e., the transaction is not
> > empty), we won’t skip committing the prepared transaction. But I think
> > that we don't need to support skipping already-prepared transaction
> > since such transaction doesn't conflict with anything regardless of
> > having changed or not.
> >
>
> Yeah, this makes sense to me.
>

I've attached an updated patch. The new syntax is like "ALTER
SUBSCRIPTION testsub SKIP (xid = '123')".

I’ve been thinking about adding a safeguard for the case where the
user specifies the wrong xid. For example, could we somehow use the
stats in pg_stat_subscription_workers? One idea is that the logical
replication worker fetches the xid from the stats when reading the
subscription and skips the transaction if the xid matches
subskipxid. That is, the worker checks the error reported by the
worker previously working on the same subscription. The error might
not be a conflict error (e.g., a connection error) or might have
been cleared by the reset function, but given that the worker is in an
error loop, it can eventually get the xid in question. This would
prevent an unrelated transaction from being skipped unexpectedly. It
does not seem like a stable solution, though. Or it might be enough to
warn users when they specify an XID that doesn’t match last_error_xid.
Anyway, I think it’s better to have more discussion on this. Any
ideas?
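
The "warn on mismatch" variant of this safeguard can be sketched as a small standalone C model. This is only an illustration under stated assumptions: check_skip_xid and the SkipCheckResult codes are hypothetical names, not part of the patch, and the last failed XID is passed in directly rather than read from pg_stat_subscription_workers.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;     /* models PostgreSQL's 32-bit XID */
#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical outcome codes for the safeguard; not taken from the patch. */
typedef enum SkipCheckResult
{
    SKIP_CHECK_OK,          /* requested XID matches the last failed XID */
    SKIP_CHECK_NO_ERROR,    /* no error recorded, e.g. stats were reset */
    SKIP_CHECK_MISMATCH     /* requested XID differs: warn the user */
} SkipCheckResult;

/*
 * Compare the XID the user asked to skip with the XID of the last
 * reported apply error (assumed to come from the stats view).
 */
SkipCheckResult
check_skip_xid(TransactionId requested_xid, TransactionId last_error_xid)
{
    assert(requested_xid != InvalidTransactionId);

    if (last_error_xid == InvalidTransactionId)
        return SKIP_CHECK_NO_ERROR;
    if (requested_xid != last_error_xid)
        return SKIP_CHECK_MISMATCH;
    return SKIP_CHECK_OK;
}
```

A caller would issue a warning for SKIP_CHECK_MISMATCH and would have to decide separately what to do for SKIP_CHECK_NO_ERROR, which is the unpredictable case mentioned above.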

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transactio.patch

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

December 11, 2021, 09:29:18

On Fri, Dec 10, 2021 at 11:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Dec 9, 2021 at 6:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Dec 9, 2021 at 2:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Dec 9, 2021 at 11:47 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > I am thinking that we can start a transaction, update the catalog,
> > > > commit that transaction. Then start a new one to update
> > > > origin_lsn/timestamp, finishprepared, and commit it. Now, if it
> > > > crashes after the first transaction, only commit prepared will be
> > > > resent again and this time we don't need to update the catalog as that
> > > > entry would be already cleared.
> > >
> > > Sounds good. In the crash case, it should be fine since we will just
> > > commit an empty transaction. The same is true for the case where
> > > skip_xid has been changed after skipping and preparing the transaction
> > > and before handling commit_prepared.
> > >
> > > Regarding the case where the user specifies XID of the transaction
> > > after it is prepared on the subscriber (i.e., the transaction is not
> > > empty), we won’t skip committing the prepared transaction. But I think
> > > that we don't need to support skipping already-prepared transaction
> > > since such transaction doesn't conflict with anything regardless of
> > > having changed or not.
> > >
> >
> > Yeah, this makes sense to me.
> >
>
> I've attached an updated patch. The new syntax is like "ALTER
> SUBSCRIPTION testsub SKIP (xid = '123')".
>
> I’ve been thinking we can do something safeguard for the case where
> the user specified the wrong xid. For example, can we somewhat use the
> stats in pg_stat_subscription_workers? An idea is that logical
> replication worker fetches the xid from the stats when reading the
> subscription and skips the transaction if the xid matches to
> subskipxid. That is, the worker checks the error reported by the
> worker previously working on the same subscription. The error could
> not be a conflict error (e.g., connection error etc.) or might have
> been cleared by the reset function, But given the worker is in an
> error loop, the worker can eventually get xid in question. We can
> prevent an unrelated transaction from being skipped unexpectedly. It
> seems not a stable solution though. Or it might be enough to warn
> users when they specified an XID that doesn’t match to last_error_xid.
>

I think the idea is good, but because it is not predictable, as you
pointed out, we might want to just issue a LOG/WARNING. If it is not
already mentioned, please mention in the docs the possibility of
skipping non-errored transactions.

Few comments/questions:
=====================
1.
+          Specifies the ID of the transaction whose application is to be skipped
+          by the logical replication worker. Setting -1 means to reset the
+          transaction ID.

Can we change it to something like: "Specifies the ID of the
transaction whose changes are to be skipped by the logical replication
worker. ...."

2.
@@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
  Assert(!isnull);
  sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));

+ /* Get skip XID */
+ datum = SysCacheGetAttr(SUBSCRIPTIONOID,
+ tup,
+ Anum_pg_subscription_subskipxid,
+ &isnull);
+ if (!isnull)
+ sub->skipxid = DatumGetTransactionId(datum);
+ else
+ sub->skipxid = InvalidTransactionId;

Can't we assign it as we do for other fixed columns like subdbid,
subowner, etc.?

3.
+ * Also, we don't skip receiving the changes in streaming cases, since we decide
+ * whether or not to skip applying the changes when starting to apply changes.

But why so? Can't we even skip streaming (and writing to file all such
messages)? If we can do this then we can avoid even collecting all
messages in a file.

4.
+ * Also, one might think that we can skip preparing the skipped transaction.
+ * But if we do that, PREPARE WAL record won’t be sent to its physical
+ * standbys, resulting in that users won’t be able to find the prepared
+ * transaction entry after a fail-over.
+ *
..
+ */
+ if (skipping_changes)
+ stop_skipping_changes(false);

Why do we need such a Prepare's entry either at current subscriber or
on its physical standby? I think it is to allow Commit-prepared. If
so, how about if we skip even commit prepared as well? Even on
physical standby, we would be having the value of skip_xid which can
help us to skip there as well after failover.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

December 13, 2021, 05:58:08

On Sat, Dec 11, 2021 at 3:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Dec 10, 2021 at 11:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Dec 9, 2021 at 6:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Dec 9, 2021 at 2:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Thu, Dec 9, 2021 at 11:47 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > I am thinking that we can start a transaction, update the catalog,
> > > > > commit that transaction. Then start a new one to update
> > > > > origin_lsn/timestamp, finishprepared, and commit it. Now, if it
> > > > > crashes after the first transaction, only commit prepared will be
> > > > > resent again and this time we don't need to update the catalog as that
> > > > > entry would be already cleared.
> > > >
> > > > Sounds good. In the crash case, it should be fine since we will just
> > > > commit an empty transaction. The same is true for the case where
> > > > skip_xid has been changed after skipping and preparing the transaction
> > > > and before handling commit_prepared.
> > > >
> > > > Regarding the case where the user specifies XID of the transaction
> > > > > after it is prepared on the subscriber (i.e., the transaction is not
> > > > empty), we won’t skip committing the prepared transaction. But I think
> > > > that we don't need to support skipping already-prepared transaction
> > > > since such transaction doesn't conflict with anything regardless of
> > > > having changed or not.
> > > >
> > >
> > > Yeah, this makes sense to me.
> > >
> >
> > I've attached an updated patch. The new syntax is like "ALTER
> > SUBSCRIPTION testsub SKIP (xid = '123')".
> >
> > I’ve been thinking we can do something safeguard for the case where
> > the user specified the wrong xid. For example, can we somewhat use the
> > stats in pg_stat_subscription_workers? An idea is that logical
> > replication worker fetches the xid from the stats when reading the
> > subscription and skips the transaction if the xid matches to
> > subskipxid. That is, the worker checks the error reported by the
> > worker previously working on the same subscription. The error could
> > not be a conflict error (e.g., connection error etc.) or might have
> > been cleared by the reset function, But given the worker is in an
> > error loop, the worker can eventually get xid in question. We can
> > prevent an unrelated transaction from being skipped unexpectedly. It
> > seems not a stable solution though. Or it might be enough to warn
> > users when they specified an XID that doesn’t match to last_error_xid.
> >
>
> I think the idea is good but because it is not predictable as pointed
> by you so we might want to just issue a LOG/WARNING. If not already
> mentioned, then please do mention in docs the possibility of skipping
> non-errored transactions.
>
> Few comments/questions:
> =====================
> 1.
> +          Specifies the ID of the transaction whose application is to
> be skipped
> +          by the logical replication worker. Setting -1 means to reset the
> +          transaction ID.
>
> Can we change it to something like: "Specifies the ID of the
> transaction whose changes are to be skipped by the logical replication
> worker. ...."
>

Agreed.

> 2.
> @@ -104,6 +104,16 @@ GetSubscription(Oid subid, bool missing_ok)
>   Assert(!isnull);
>   sub->publications = textarray_to_stringlist(DatumGetArrayTypeP(datum));
>
> + /* Get skip XID */
> + datum = SysCacheGetAttr(SUBSCRIPTIONOID,
> + tup,
> + Anum_pg_subscription_subskipxid,
> + &isnull);
> + if (!isnull)
> + sub->skipxid = DatumGetTransactionId(datum);
> + else
> + sub->skipxid = InvalidTransactionId;
>
> Can't we assign it as we do for other fixed columns like subdbid,
> subowner, etc.?
>

Yeah, I think we can use InvalidTransactionId as the initial value
instead of setting NULL. Then, we can change this code.

> 3.
> + * Also, we don't skip receiving the changes in streaming cases,
> since we decide
> + * whether or not to skip applying the changes when starting to apply changes.
>
> But why so? Can't we even skip streaming (and writing to file all such
> messages)? If we can do this then we can avoid even collecting all
> messages in a file.

IIUC in streaming cases, a transaction can be sent to the subscriber
split into multiple chunks of changes. In the meantime,
skip_xid can be changed. If the user changed or cleared skip_xid after
the subscriber had skipped some streamed changes, the subscriber would
not have the complete changes of the transaction.
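
The problem can be illustrated with a small standalone C model. It is a sketch under stated assumptions, not code from the patch: StreamChunk and spooled_changes_per_chunk_decision are hypothetical names, and the skip decision is deliberately made per chunk to show what goes wrong.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;     /* models PostgreSQL's 32-bit XID */
#define InvalidTransactionId ((TransactionId) 0)

/* One streamed chunk of a remote transaction (hypothetical model). */
typedef struct StreamChunk
{
    TransactionId xid;      /* remote transaction the chunk belongs to */
    int         nchanges;   /* number of changes carried by the chunk */
} StreamChunk;

/*
 * Count the changes that end up spooled if the skip decision is made per
 * chunk, using whatever skip_xid is in effect when the chunk arrives.
 * skip_xid_at[i] is the setting seen while receiving chunks[i].
 */
int
spooled_changes_per_chunk_decision(const StreamChunk *chunks,
                                   const TransactionId *skip_xid_at,
                                   size_t nchunks)
{
    int         spooled = 0;

    assert(chunks != NULL && skip_xid_at != NULL);

    for (size_t i = 0; i < nchunks; i++)
    {
        if (skip_xid_at[i] != InvalidTransactionId &&
            skip_xid_at[i] == chunks[i].xid)
            continue;       /* chunk discarded instead of spooled */
        spooled += chunks[i].nchanges;
    }
    return spooled;
}
```

If a transaction arrives as three chunks and skip_xid is cleared before the last one, only the last chunk's changes are spooled, so the transaction can be neither fully applied nor fully skipped.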

>
> 4.
> + * Also, one might think that we can skip preparing the skipped transaction.
> + * But if we do that, PREPARE WAL record won’t be sent to its physical
> + * standbys, resulting in that users won’t be able to find the prepared
> + * transaction entry after a fail-over.
> + *
> ..
> + */
> + if (skipping_changes)
> + stop_skipping_changes(false);
>
> Why do we need such a Prepare's entry either at current subscriber or
> on its physical standby? I think it is to allow Commit-prepared. If
> so, how about if we skip even commit prepared as well? Even on
> physical standby, we would be having the value of skip_xid which can
> help us to skip there as well after failover.

It's true that skip_xid would also be set on the physical standby. When it
comes to preparing the skipped transaction on the current subscriber,
if we want to skip commit-prepared I think we need protocol changes in
order for subscribers to know prepare_lsn and prepare_timestamp so
that they can look up the prepared transaction when doing
commit-prepared. I proposed this idea before. This change would be
beneficial as of now since the publisher sends even empty transactions.
But considering the proposed patch[1] that makes the publisher not
send empty transactions, this protocol change would be an optimization
only for this feature.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Greg Nancarrow

Date:

December 13, 2021, 06:12:10

On Fri, Dec 10, 2021 at 4:44 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached an updated patch. The new syntax is like "ALTER
> SUBSCRIPTION testsub SKIP (xid = '123')".
>

I have some review comments:

(1) Patch comment - some suggested wording improvements

BEFORE:
If incoming change violates any constraint, logical replication stops
AFTER:
If an incoming change violates any constraint, logical replication stops

BEFORE:
The user can specify XID by ALTER SUBSCRIPTION ... SKIP (xid = XXX),
updating pg_subscription.subskipxid field, telling the apply worker to
skip the transaction.
AFTER:
The user can specify the XID of the transaction to skip using
ALTER SUBSCRIPTION ... SKIP (xid = XXX), updating the pg_subscription.subskipxid
field, telling the apply worker to skip the transaction.

doc/src/sgml/logical-replication.sgml
(2) Some suggested wording improvements

(i) Missing "the"
BEFORE:
+   the existing data.  When a conflict produce an error, it is shown in
AFTER:
+   the existing data.  When a conflict produce an error, it is shown in the

(ii) Suggest starting a new sentence
BEFORE:
+   and it is also shown in subscriber's server log as follows:
AFTER:
+   The error is also shown in the subscriber's server log as follows:


(iii) Context message should say "at ..." instead of "with commit
timestamp ...", to match the actual output from the current code
BEFORE:
+CONTEXT:  processing remote data during "INSERT" for replication
target relation "public.test" in transaction 716 with commit timestamp
2021-09-29 15:52:45.165754+00
AFTER:
+CONTEXT:  processing remote data during "INSERT" for replication
target relation "public.test" in transaction 716 at 2021-09-29
15:52:45.165754+00


(iv) The following paragraph seems out of place, with the information
presented in the wrong order:

+  <para>
+   In this case, you need to consider changing the data on the
subscriber so that it
+   doesn't conflict with incoming changes, or dropping the
conflicting constraint or
+   unique index, or writing a trigger on the subscriber to suppress or redirect
+   conflicting incoming changes, or as a last resort, by skipping the
whole transaction.
+   They skip the whole transaction, including changes that may not violate any
+   constraint.  They may easily make the subscriber inconsistent, especially if
+   a user specifies the wrong transaction ID or the position of origin.
+  </para>


How about rearranging it as follows:

+  <para>
+   These methods skip the whole transaction, including changes that
may not violate
+   any constraint. They may easily make the subscriber inconsistent,
especially if
+   a user specifies the wrong transaction ID or the position of
origin, and should
+   be used as a last resort.
+   Alternatively, you might consider changing the data on the
subscriber so that it
+   doesn't conflict with incoming changes, or dropping the
conflicting constraint or
+   unique index, or writing a trigger on the subscriber to suppress or redirect
+   conflicting incoming changes.
+  </para>


doc/src/sgml/ref/alter_subscription.sgml
(3)

(i) Doc needs clarification
BEFORE:
+      the whole transaction.  The logical replication worker skips all data
AFTER:
+      the whole transaction.  For the latter case, the logical
replication worker skips all data


(ii) "Setting -1 means to reset the transaction ID"

Shouldn't it be explained what resetting actually does and when it can
be, or is needed to be, done? Isn't it automatically reset?
I notice that negative values (other than -1) seem to be regarded as
valid - is that right?
Also, what happens if this option is set multiple times? Does it just
override and use the latest setting? (other option handling errors out
with errorConflictingDefElem()).
e.g. alter subscription sub skip (xid = 721, xid = 722);


src/backend/replication/logical/worker.c
(4) Shouldn't the "done skipping logical replication transaction"
message also include the skipped XID value at the end?


src/test/subscription/t/027_skip_xact.pl
(5) Some suggested wording improvements

(i)
BEFORE:
+# Test skipping the transaction. This function must be called after the caller
+# inserting data that conflict with the subscriber.  After waiting for the
+# subscription worker stats are updated, we skip the transaction in question
+# by ALTER SUBSCRIPTION ... SKIP. Then, check if logical replication
can continue
+# working by inserting $nonconflict_data on the publisher.
AFTER:
+# Test skipping the transaction. This function must be called after the caller
+# inserts data that conflicts with the subscriber.  After waiting for the
+# subscription worker stats to be updated, we skip the transaction in question
+# by ALTER SUBSCRIPTION ... SKIP. Then, check if logical replication
can continue
+# working by inserting $nonconflict_data on the publisher.

(ii)
BEFORE:
+# will conflict with the data replicated from publisher later.
AFTER:
+# will conflict with the data replicated later from the publisher.
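
On the question in (3)(ii) about a repeated option such as "skip (xid = 721, xid = 722)": the stricter behavior, where a duplicate is rejected rather than silently overriding the earlier value, can be sketched in standalone C. This is only a toy model of that behavior. SkipOption and parse_skip_options are hypothetical names, and the real option handling in PostgreSQL works on DefElem lists rather than on this struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A name/value pair from the SKIP (...) option list (hypothetical model). */
typedef struct SkipOption
{
    const char *name;
    const char *value;
} SkipOption;

/*
 * Accept the option list only if it contains "xid" exactly once and no
 * other option. On success *xid_value points at the value; on failure it
 * is set to NULL.
 */
bool
parse_skip_options(const SkipOption *opts, size_t nopts,
                   const char **xid_value)
{
    bool        xid_given = false;

    assert(xid_value != NULL);
    *xid_value = NULL;

    for (size_t i = 0; i < nopts; i++)
    {
        if (strcmp(opts[i].name, "xid") != 0 || xid_given)
        {
            /* unrecognized option, or "xid" specified more than once */
            *xid_value = NULL;
            return false;
        }
        xid_given = true;
        *xid_value = opts[i].value;
    }
    return xid_given;
}
```

With this model, the example statement above would fail instead of using 722.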


Regards,
Greg Nancarrow
Fujitsu Australia

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

December 13, 2021, 07:04:29

On Mon, Dec 13, 2021 at 8:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sat, Dec 11, 2021 at 3:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > 3.
> > + * Also, we don't skip receiving the changes in streaming cases,
> > since we decide
> > + * whether or not to skip applying the changes when starting to apply changes.
> >
> > But why so? Can't we even skip streaming (and writing to file all such
> > messages)? If we can do this then we can avoid even collecting all
> > messages in a file.
>
> IIUC in streaming cases, a transaction can be sent to the subscriber
> while splitting into multiple chunks of changes. In the meanwhile,
> skip_xid can be changed. If the user changed or cleared skip_xid after
> the subscriber skips some streamed changes, the subscriber won't able
> to have complete changes of the transaction.
>

Yeah, I think if we want, we can handle this by writing into the stream
xid file whether the changes need to be skipped, so that
subsequent streams can check that in the file, or maybe by somehow
not allowing skip_xid to be changed in the worker if it is already
skipping some xact. If we don't want to do anything for this, then it is
better to at least reflect this reasoning in the comments.
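
The first option, remembering the decision alongside the streamed transaction, could look roughly like the standalone C model below. It is a sketch under assumptions: StreamXactState stands in for whatever would be written to the stream xid file, and receive_chunk is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;     /* models PostgreSQL's 32-bit XID */
#define InvalidTransactionId ((TransactionId) 0)

/* Per-transaction state kept across stream chunks (hypothetical model). */
typedef struct StreamXactState
{
    TransactionId xid;      /* remote transaction being streamed */
    bool        decided;    /* has the skip decision been recorded? */
    bool        skip;       /* the recorded decision */
    int         spooled;    /* changes written to the spool so far */
} StreamXactState;

/*
 * Handle one chunk. The skip decision is taken once, at the first chunk,
 * and reused for every later chunk even if skip_xid changes meanwhile.
 */
void
receive_chunk(StreamXactState *state, TransactionId current_skip_xid,
              int nchanges)
{
    assert(state != NULL);

    if (!state->decided)
    {
        state->skip = (current_skip_xid != InvalidTransactionId &&
                       current_skip_xid == state->xid);
        state->decided = true;
    }
    if (!state->skip)
        state->spooled += nchanges;
}
```

The trade-off in this model is that a skip_xid set after the first chunk has arrived does not affect that transaction at all.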

> >
> > 4.
> > + * Also, one might think that we can skip preparing the skipped transaction.
> > + * But if we do that, PREPARE WAL record won’t be sent to its physical
> > + * standbys, resulting in that users won’t be able to find the prepared
> > + * transaction entry after a fail-over.
> > + *
> > ..
> > + */
> > + if (skipping_changes)
> > + stop_skipping_changes(false);
> >
> > Why do we need such a Prepare's entry either at current subscriber or
> > on its physical standby? I think it is to allow Commit-prepared. If
> > so, how about if we skip even commit prepared as well? Even on
> > physical standby, we would be having the value of skip_xid which can
> > help us to skip there as well after failover.
>
> It's true that skip_xid would be set also on physical standby. When it
> comes to preparing the skipped transaction on the current subscriber,
> if we want to skip commit-prepared I think we need protocol changes in
> > order for subscribers to know prepare_lsn and prepare_timestamp so
> > that it can lookup the prepared transaction when doing
> > commit-prepared. I proposed this idea before. This change would be
> > beneficial as of now since the publisher sends even empty transactions.
> > But considering the proposed patch[1] that makes the publisher not
> send empty transaction, this protocol change would be an optimization
> only for this feature.
>

I was thinking of comparing the xid received as part of the
commit_prepared message with the value of skip_xid to skip the
commit_prepared, but I guess the user could change it between prepare
and commit prepared, and then we won't be able to detect it, right? I
think we can handle this and the streaming case if we disallow users
from changing the value of skip_xid when we are already skipping changes,
or don't let the new skip_xid take effect in the apply worker if we are
already skipping some other transaction. What do you think?

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

December 13, 2021, 16:24:38

On Mon, Dec 13, 2021 at 1:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Dec 13, 2021 at 8:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Sat, Dec 11, 2021 at 3:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > 3.
> > > + * Also, we don't skip receiving the changes in streaming cases,
> > > since we decide
> > > + * whether or not to skip applying the changes when starting to apply changes.
> > >
> > > But why so? Can't we even skip streaming (and writing to file all such
> > > messages)? If we can do this then we can avoid even collecting all
> > > messages in a file.
> >
> > IIUC in streaming cases, a transaction can be sent to the subscriber
> > while splitting into multiple chunks of changes. In the meanwhile,
> > skip_xid can be changed. If the user changed or cleared skip_xid after
> > the subscriber skips some streamed changes, the subscriber won't able
> > to have complete changes of the transaction.
> >
>
> Yeah, I think if we want we can handle this by writing into the stream
> xid file whether the changes need to be skipped and then the
> consecutive streams can check that in the file or may be in some way
> don't allow skip_xid to be changed in worker if it is already skipping
> some xact. If we don't want to do anything for this then it is better
> to at least reflect this reasoning in the comments.

Yes. Given that we still need to apply messages other than
data-modification messages, we need to skip writing only the
data-modification changes to the stream file.

>
> > >
> > > 4.
> > > + * Also, one might think that we can skip preparing the skipped transaction.
> > > + * But if we do that, PREPARE WAL record won’t be sent to its physical
> > > + * standbys, resulting in that users won’t be able to find the prepared
> > > + * transaction entry after a fail-over.
> > > + *
> > > ..
> > > + */
> > > + if (skipping_changes)
> > > + stop_skipping_changes(false);
> > >
> > > Why do we need such a Prepare's entry either at current subscriber or
> > > on its physical standby? I think it is to allow Commit-prepared. If
> > > so, how about if we skip even commit prepared as well? Even on
> > > physical standby, we would be having the value of skip_xid which can
> > > help us to skip there as well after failover.
> >
> > It's true that skip_xid would be set also on physical standby. When it
> > comes to preparing the skipped transaction on the current subscriber,
> > if we want to skip commit-prepared I think we need protocol changes in
> > > order for subscribers to know prepare_lsn and prepare_timestamp so
> > > that it can lookup the prepared transaction when doing
> > > commit-prepared. I proposed this idea before. This change would be
> > > beneficial as of now since the publisher sends even empty transactions.
> > > But considering the proposed patch[1] that makes the publisher not
> > send empty transaction, this protocol change would be an optimization
> > only for this feature.
> >
>
> I was thinking to compare the xid received as part of the
> commit_prepared message with the value of skip_xid to skip the
> commit_prepared but I guess the user would change it between prepare
> and commit prepare and then we won't be able to detect it, right? I
> think we can handle this and the streaming case if we disallow users
> to change the value of skip_xid when we are already skipping changes
> or don't let the new skip_xid to reflect in the apply worker if we are
> already skipping some other transaction. What do you think?

In streaming cases, we don’t know when stream-commit or stream-abort
will arrive, and another conflict could occur on the subscription in the
meantime. But given that (we expect) this feature is used after the
apply worker enters an error loop, this is unlikely to happen in
practice unless the user sets the wrong XID. Similarly, in 2PC cases,
we don’t know when commit-prepared or rollback-prepared will arrive, and
another conflict could occur in the meantime. But this could occur in
practice even if the user specified the correct XID. Therefore, if we
disallow changing skip_xid until the subscriber receives
commit-prepared or rollback-prepared, we cannot skip the second
transaction that conflicts with data on the subscriber.
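
The 2PC argument can be made concrete with a small standalone C model of a worker that latches skip_xid until commit-prepared arrives. It is a sketch under assumptions, not the patch's behavior: ApplyWorker, begin_remote_xact, and finish_prepared are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;     /* models PostgreSQL's 32-bit XID */
#define InvalidTransactionId ((TransactionId) 0)

/* Apply worker that latches the XID it is skipping (hypothetical model). */
typedef struct ApplyWorker
{
    TransactionId latched_skip_xid; /* Invalid when nothing is latched */
} ApplyWorker;

/*
 * Called when a remote transaction starts. Returns true if it is skipped.
 * While an XID is latched, a newer catalog value is ignored.
 */
bool
begin_remote_xact(ApplyWorker *w, TransactionId xid,
                  TransactionId catalog_skip_xid)
{
    TransactionId effective;

    assert(w != NULL && xid != InvalidTransactionId);

    effective = (w->latched_skip_xid != InvalidTransactionId)
        ? w->latched_skip_xid
        : catalog_skip_xid;

    if (effective != xid)
        return false;

    w->latched_skip_xid = xid;  /* stays latched across PREPARE */
    return true;
}

/* Called on commit-prepared or rollback-prepared for the given XID. */
void
finish_prepared(ApplyWorker *w, TransactionId xid)
{
    assert(w != NULL);

    if (w->latched_skip_xid == xid)
        w->latched_skip_xid = InvalidTransactionId;
}
```

In this model, once transaction 800 has been skipped and prepared, a second conflicting transaction 801 cannot be skipped until commit-prepared for 800 arrives, no matter what the user sets.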

From the application's perspective, which behavior is preferable in the
first place: skipping the prepare of a transaction, or preparing an
empty transaction? In terms of resource consumption, skipping
the prepare seems better. On the other hand, if we skipped
preparing the transaction, the application would not be able to find
the prepared transaction after a fail-over to the subscriber.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

December 14, 2021, 05:49:44

On Mon, Dec 13, 2021 at 6:55 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Dec 13, 2021 at 1:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Dec 13, 2021 at 8:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > > >
> > > > 4.
> > > > + * Also, one might think that we can skip preparing the skipped transaction.
> > > > + * But if we do that, PREPARE WAL record won’t be sent to its physical
> > > > + * standbys, resulting in that users won’t be able to find the prepared
> > > > + * transaction entry after a fail-over.
> > > > + *
> > > > ..
> > > > + */
> > > > + if (skipping_changes)
> > > > + stop_skipping_changes(false);
> > > >
> > > > Why do we need such a Prepare's entry either at current subscriber or
> > > > on its physical standby? I think it is to allow Commit-prepared. If
> > > > so, how about if we skip even commit prepared as well? Even on
> > > > physical standby, we would be having the value of skip_xid which can
> > > > help us to skip there as well after failover.
> > >
> > > It's true that skip_xid would be set also on physical standby. When it
> > > comes to preparing the skipped transaction on the current subscriber,
> > > if we want to skip commit-prepared I think we need protocol changes in
> > > > order for subscribers to know prepare_lsn and prepare_timestamp so
> > > > that it can lookup the prepared transaction when doing
> > > > commit-prepared. I proposed this idea before. This change would be
> > > > beneficial as of now since the publisher sends even empty transactions.
> > > > But considering the proposed patch[1] that makes the publisher not
> > > send empty transaction, this protocol change would be an optimization
> > > only for this feature.
> > >
> >
> > I was thinking to compare the xid received as part of the
> > commit_prepared message with the value of skip_xid to skip the
> > commit_prepared but I guess the user would change it between prepare
> > and commit prepare and then we won't be able to detect it, right? I
> > think we can handle this and the streaming case if we disallow users
> > to change the value of skip_xid when we are already skipping changes
> > or don't let the new skip_xid to reflect in the apply worker if we are
> > already skipping some other transaction. What do you think?
>
> In streaming cases, we don’t know when stream-commit or stream-abort
> comes and another conflict could occur on the subscription in the
> meanwhile. But given that (we expect) this feature is used after the
> apply worker enters into an error loop, this is unlikely to happen in
> practice unless the user sets the wrong XID. Similarly, in 2PC cases,
> we don’t know when commit-prepared or rollback-prepared comes and
> another conflict could occur in the meanwhile. But this could occur in
> practice even if the user specified the correct XID. Therefore, if we
> disallow to change skip_xid until the subscriber receives
> commit-prepared or rollback-prepared, we cannot skip the second
> transaction that conflicts with data on the subscriber.
>

I agree with this theory. Can we reflect this in comments so that in
the future we know why we didn't pursue this direction?

> From the application perspective, which behavior is preferable between
> skipping preparing a transaction and preparing an empty transaction,
> in the first place? From the resource consumption etc., skipping
> preparing transactions seems better. On the other hand, if we skipped
> preparing the transaction, the application would not be able to find
> the prepared transaction after a fail-over to the subscriber.
>

I am not sure how much it matters that such prepares are not present,
because we wanted to somehow skip the corresponding commit prepared
as well. I think your previous point is a good enough reason why
we should allow such prepares.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

vignesh C

Date:

December 14, 2021, 07:23:06

On Fri, Dec 10, 2021 at 11:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Dec 9, 2021 at 6:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Dec 9, 2021 at 2:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Dec 9, 2021 at 11:47 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > I am thinking that we can start a transaction, update the catalog,
> > > > commit that transaction. Then start a new one to update
> > > > origin_lsn/timestamp, finishprepared, and commit it. Now, if it
> > > > crashes after the first transaction, only commit prepared will be
> > > > resent again and this time we don't need to update the catalog as that
> > > > entry would be already cleared.
> > >
> > > Sounds good. In the crash case, it should be fine since we will just
> > > commit an empty transaction. The same is true for the case where
> > > skip_xid has been changed after skipping and preparing the transaction
> > > and before handling commit_prepared.
> > >
> > > Regarding the case where the user specifies XID of the transaction
> > > after it is prepared on the subscriber (i.e., the transaction is not
> > > empty), we won’t skip committing the prepared transaction. But I think
> > > that we don't need to support skipping already-prepared transaction
> > > since such transaction doesn't conflict with anything regardless of
> > > having changed or not.
> > >
> >
> > Yeah, this makes sense to me.
> >
>
> I've attached an updated patch. The new syntax is like "ALTER
> SUBSCRIPTION testsub SKIP (xid = '123')".
>
> I’ve been thinking we can do something safeguard for the case where
> the user specified the wrong xid. For example, can we somewhat use the
> stats in pg_stat_subscription_workers? An idea is that logical
> replication worker fetches the xid from the stats when reading the
> subscription and skips the transaction if the xid matches to
> subskipxid. That is, the worker checks the error reported by the
> worker previously working on the same subscription. The error could
> not be a conflict error (e.g., connection error etc.) or might have
> been cleared by the reset function, But given the worker is in an
> error loop, the worker can eventually get xid in question. We can
> prevent an unrelated transaction from being skipped unexpectedly. It
> seems not a stable solution though. Or it might be enough to warn
> users when they specified an XID that doesn’t match to last_error_xid.
> Anyway, I think it’s better to have more discussion on this. Any
> ideas?

If the user specifies another transaction to skip while the worker is
still skipping the previously specified one, the new value will be
reset by the worker when it clears the skip xid. I
felt that once the worker has identified the skip xid and is about to
skip the transaction, the worker can acquire a lock to prevent
concurrency issues:
+static void
+clear_subscription_skip_xid(void)
+{
+       Relation        rel;
+       HeapTuple       tup;
+       bool            nulls[Natts_pg_subscription];
+       bool            replaces[Natts_pg_subscription];
+       Datum           values[Natts_pg_subscription];
+
+       memset(values, 0, sizeof(values));
+       memset(nulls, false, sizeof(nulls));
+       memset(replaces, false, sizeof(replaces));
+
+       if (!IsTransactionState())
+               StartTransactionCommand();
+
+       rel = table_open(SubscriptionRelationId, RowExclusiveLock);
+
+       /* Fetch the existing tuple. */
+       tup = SearchSysCacheCopy1(SUBSCRIPTIONOID,
+                                 ObjectIdGetDatum(MySubscription->oid));
+
+       if (!HeapTupleIsValid(tup))
+               elog(ERROR, "subscription \"%s\" does not exist",
+                    MySubscription->name);
+
+       /* Set subskipxid to null */
+       nulls[Anum_pg_subscription_subskipxid - 1] = true;
+       replaces[Anum_pg_subscription_subskipxid - 1] = true;
+
+       /* Update the system catalog to reset the skip XID */
+       tup = heap_modify_tuple(tup, RelationGetDescr(rel), values, nulls,
+                                                       replaces);
+       CatalogTupleUpdate(rel, &tup->t_self, tup);
+
+       heap_freetuple(tup);
+       table_close(rel, RowExclusiveLock);
+}

Regards,
Vignesh

Re: Skipping logical replication transactions on subscriber side

From

Greg Nancarrow

Date:

14 December 2021, 08:35:38

On Tue, Dec 14, 2021 at 3:23 PM vignesh C <vignesh21@gmail.com> wrote:
>
> While the worker is skipping one of the skip transactions specified by
> the user and immediately if the user specifies another skip
> transaction while the skipping of the transaction is in progress this
> new value will be reset by the worker while clearing the skip xid. I
> felt once the worker has identified the skip xid and is about to skip
> the xid, the worker can acquire a lock to prevent concurrency issues:

That's a good point.
If only the last_error_xid could be skipped, then this wouldn't be an
issue, right?
If a different xid to skip is specified while the worker is currently
skipping a transaction, should that even be allowed?


Regards,
Greg Nancarrow
Fujitsu Australia

Re: Skipping logical replication transactions on subscriber side

From

Dilip Kumar

Date:

14 December 2021, 09:10:38

On Fri, Dec 3, 2021 at 12:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Skipping a whole transaction by specifying xid would be a good start.
> Ideally, we'd like to automatically skip only operations within the
> transaction that fail but it seems not easy to achieve. If we allow
> specifying operations and/or relations, probably multiple operations
> or relations need to be specified in some cases. Otherwise, the
> subscriber cannot continue logical replication if the transaction has
> multiple operations on different relations that fail. But similar to
> the idea of specifying multiple xids, we need to note the fact that
> user wouldn't know of the second operation failure unless the apply
> worker applies the change. So I'm not sure there are many use cases in
> practice where users can specify multiple operations and relations in
> order to skip applies that fail.

I think there would be use cases for specifying the relations or
operations. For example, suppose the user hits an error when inserting
into a particular relation and, after some manual investigation, finds
that the table has a constraint on the subscriber side that does not
exist on the publisher side. The user may then be okay with skipping
the changes for this table but not for other tables. There might also
be a few more tables that are designed on the same principle and can
hit a similar error, so wouldn't it be good to provide an option to
give a list of all such tables?

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: Skipping logical replication transactions on subscriber side

From

Dilip Kumar

Date:

14 December 2021, 10:37:02

On Tue, Dec 14, 2021 at 8:20 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Dec 13, 2021 at 6:55 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

> > In streaming cases, we don’t know when stream-commit or stream-abort
> > comes and another conflict could occur on the subscription in the
> > meanwhile. But given that (we expect) this feature is used after the
> > apply worker enters into an error loop, this is unlikely to happen in
> > practice unless the user sets the wrong XID. Similarly, in 2PC cases,
> > we don’t know when commit-prepared or rollback-prepared comes and
> > another conflict could occur in the meanwhile. But this could occur in
> > practice even if the user specified the correct XID. Therefore, if we
> > disallow to change skip_xid until the subscriber receives
> > commit-prepared or rollback-prepared, we cannot skip the second
> > transaction that conflicts with data on the subscriber.
> >
>
> I agree with this theory. Can we reflect this in comments so that in
> the future we know why we didn't pursue this direction?

I might be missing something here, but for a streaming transaction,
users can decide whether they want to skip it only once we start
applying, no?  I mean, we can get errors only once we start applying
the changes, and by that time we must have all the changes for the
transaction.  So I do not understand the point we are trying to
discuss here.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

14 December 2021, 12:05:50

On Tue, Dec 14, 2021 at 1:07 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Dec 14, 2021 at 8:20 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Dec 13, 2021 at 6:55 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> > > In streaming cases, we don’t know when stream-commit or stream-abort
> > > comes and another conflict could occur on the subscription in the
> > > meanwhile. But given that (we expect) this feature is used after the
> > > apply worker enters into an error loop, this is unlikely to happen in
> > > practice unless the user sets the wrong XID. Similarly, in 2PC cases,
> > > we don’t know when commit-prepared or rollback-prepared comes and
> > > another conflict could occur in the meanwhile. But this could occur in
> > > practice even if the user specified the correct XID. Therefore, if we
> > > disallow to change skip_xid until the subscriber receives
> > > commit-prepared or rollback-prepared, we cannot skip the second
> > > transaction that conflicts with data on the subscriber.
> > >
> >
> > I agree with this theory. Can we reflect this in comments so that in
> > the future we know why we didn't pursue this direction?
>
> I might be missing something here, but for streaming, transaction
> users can decide whether they wants to skip or not only once we start
> applying no?  I mean only once we start applying the changes we can
> get some errors and by that time we must be having all the changes for
> the transaction.
>

That is right and as per my understanding, the patch is trying to
accomplish the same.

>  So I do not understand the point we are trying to
> discuss here?
>

The point is whether we can skip the changes during streaming itself,
that is, when we receive the changes and write them to a stream file.
Now, it is possible that streams from multiple transactions are
interleaved and users change the skip_xid in between. It is not that
we can't handle this, but it would require a more complex design, and
it doesn't seem worth it because we can skip the changes while
applying anyway, as you mentioned in the previous paragraph.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

vignesh C

Date:

14 December 2021, 12:53:30

On Fri, Dec 10, 2021 at 11:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Dec 9, 2021 at 6:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Dec 9, 2021 at 2:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Dec 9, 2021 at 11:47 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > I am thinking that we can start a transaction, update the catalog,
> > > > commit that transaction. Then start a new one to update
> > > > origin_lsn/timestamp, finishprepared, and commit it. Now, if it
> > > > crashes after the first transaction, only commit prepared will be
> > > > resent again and this time we don't need to update the catalog as that
> > > > entry would be already cleared.
> > >
> > > Sounds good. In the crash case, it should be fine since we will just
> > > commit an empty transaction. The same is true for the case where
> > > skip_xid has been changed after skipping and preparing the transaction
> > > and before handling commit_prepared.
> > >
> > > Regarding the case where the user specifies XID of the transaction
> > > after it is prepared on the subscriber (i.g., the transaction is not
> > > empty), we won’t skip committing the prepared transaction. But I think
> > > that we don't need to support skipping already-prepared transaction
> > > since such transaction doesn't conflict with anything regardless of
> > > having changed or not.
> > >
> >
> > Yeah, this makes sense to me.
> >
>
> I've attached an updated patch. The new syntax is like "ALTER
> SUBSCRIPTION testsub SKIP (xid = '123')".
>
> I’ve been thinking we can do something safeguard for the case where
> the user specified the wrong xid. For example, can we somewhat use the
> stats in pg_stat_subscription_workers? An idea is that logical
> replication worker fetches the xid from the stats when reading the
> subscription and skips the transaction if the xid matches to
> subskipxid. That is, the worker checks the error reported by the
> worker previously working on the same subscription. The error could
> not be a conflict error (e.g., connection error etc.) or might have
> been cleared by the reset function, But given the worker is in an
> error loop, the worker can eventually get xid in question. We can
> prevent an unrelated transaction from being skipped unexpectedly. It
> seems not a stable solution though. Or it might be enough to warn
> users when they specified an XID that doesn’t match to last_error_xid.
> Anyway, I think it’s better to have more discussion on this. Any
> ideas?

A few comments:
1) Should we check whether a conflicting option is specified, as is done
for the other options above?
+               else if (strcmp(defel->defname, "xid") == 0)
+               {
+                       char *xid_str = defGetString(defel);
+                       TransactionId xid;
+
+                       if (strcmp(xid_str, "-1") == 0)
+                       {
+                               /* Setting -1 to xid means to reset it */
+                               xid = InvalidTransactionId;
+                       }
+                       else
+                       {

2) Currently only superusers can set the skip xid; we could mention this
in the documentation:
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to set %s", "skip_xid")));

3) There is an extra tab before "The resolution can be done ...", it
can be removed.
+      Skip applying changes of the particular transaction.  If incoming data
+      violates any constraints the logical replication will stop until it is
+      resolved. The resolution can be done either by changing data on the
+      subscriber so that it doesn't conflict with incoming change or by skipping
+      the whole transaction.  The logical replication worker skips all data

4) An xid of -2 is currently allowed; maybe that is ok. If it is fine, we
can remove it from the fail section.
+-- fail
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1.1);
+ERROR:  invalid transaction id: 1.1
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = -2);
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 0);
+ERROR:  invalid transaction id: 0
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1);

Regards,
Vignesh

Re: Skipping logical replication transactions on subscriber side

From

Dilip Kumar

Date:

14 December 2021, 13:11:27

On Tue, Dec 14, 2021 at 2:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > >
> > > I agree with this theory. Can we reflect this in comments so that in
> > > the future we know why we didn't pursue this direction?
> >
> > I might be missing something here, but for streaming, transaction
> > users can decide whether they wants to skip or not only once we start
> > applying no?  I mean only once we start applying the changes we can
> > get some errors and by that time we must be having all the changes for
> > the transaction.
> >
>
> That is right and as per my understanding, the patch is trying to
> accomplish the same.
>
> >  So I do not understand the point we are trying to
> > discuss here?
> >
>
> The point is that whether we can skip the changes while streaming
> itself like when we get the changes and write to a stream file. Now,
> it is possible that streams from multiple transactions can be
> interleaved and users can change the skip_xid in between. It is not
> that we can't handle this but that would require a more complex design
> and it doesn't seem worth it because we can anyway skip the changes
> while applying as you mentioned in the previous paragraph.

Actually, I was trying to understand the use case for skipping while
streaming.  During streaming we are not doing any database operation,
which means it will not generate any error.  So IIUC, there is no use
case for skipping during streaming itself? Is there any use case that
I am not aware of?


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

14 December 2021, 14:23:55

On Tue, Dec 14, 2021 at 3:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Dec 14, 2021 at 2:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > >
> > > > I agree with this theory. Can we reflect this in comments so that in
> > > > the future we know why we didn't pursue this direction?
> > >
> > > I might be missing something here, but for streaming, transaction
> > > users can decide whether they wants to skip or not only once we start
> > > applying no?  I mean only once we start applying the changes we can
> > > get some errors and by that time we must be having all the changes for
> > > the transaction.
> > >
> >
> > That is right and as per my understanding, the patch is trying to
> > accomplish the same.
> >
> > >  So I do not understand the point we are trying to
> > > discuss here?
> > >
> >
> > The point is that whether we can skip the changes while streaming
> > itself like when we get the changes and write to a stream file. Now,
> > it is possible that streams from multiple transactions can be
> > interleaved and users can change the skip_xid in between. It is not
> > that we can't handle this but that would require a more complex design
> > and it doesn't seem worth it because we can anyway skip the changes
> > while applying as you mentioned in the previous paragraph.
>
> Actually, I was trying to understand the use case for skipping while
> streaming.  Actually, during streaming we are not doing any database
> operation that means this will not generate any error.
>

Say, there is an error the first time when we start to apply changes
for such a transaction. So, such a transaction will be streamed again.
Say, the user has set the skip_xid before we stream a second time, so
this time, we can skip it either during the stream phase or apply
phase. I think the patch is skipping it during apply phase.
Sawada-San, please confirm if my understanding is correct?

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

15 December 2021, 03:38:34

On Tue, Dec 14, 2021 at 8:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Dec 14, 2021 at 3:41 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Tue, Dec 14, 2021 at 2:36 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > > >
> > > > > I agree with this theory. Can we reflect this in comments so that in
> > > > > the future we know why we didn't pursue this direction?
> > > >
> > > > I might be missing something here, but for streaming, transaction
> > > > users can decide whether they wants to skip or not only once we start
> > > > applying no?  I mean only once we start applying the changes we can
> > > > get some errors and by that time we must be having all the changes for
> > > > the transaction.
> > > >
> > >
> > > That is right and as per my understanding, the patch is trying to
> > > accomplish the same.
> > >
> > > >  So I do not understand the point we are trying to
> > > > discuss here?
> > > >
> > >
> > > The point is that whether we can skip the changes while streaming
> > > itself like when we get the changes and write to a stream file. Now,
> > > it is possible that streams from multiple transactions can be
> > > interleaved and users can change the skip_xid in between. It is not
> > > that we can't handle this but that would require a more complex design
> > > and it doesn't seem worth it because we can anyway skip the changes
> > > while applying as you mentioned in the previous paragraph.
> >
> > Actually, I was trying to understand the use case for skipping while
> > streaming.  Actually, during streaming we are not doing any database
> > operation that means this will not generate any error.
> >
>
> Say, there is an error the first time when we start to apply changes
> for such a transaction. So, such a transaction will be streamed again.
> Say, the user has set the skip_xid before we stream a second time, so
> this time, we can skip it either during the stream phase or apply
> phase. I think the patch is skipping it during apply phase.
> Sawada-San, please confirm if my understanding is correct?

My understanding is the same. The patch doesn't skip during the
streaming phase but starts skipping when it starts to apply changes.
That is, we receive streamed changes and write them to the stream file
regardless of skip_xid. When we receive the stream-commit message, we
check whether to skip this transaction, and if so, we apply all
messages in the stream file except the data modification messages.
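
The behavior described above can be sketched as a simplified standalone model. This is not the patch itself: MsgKind, StreamMsg, and replay_stream_file are hypothetical names, the "stream file" is just an in-memory array, and treating TRUNCATE as a data modification message is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message kinds; the real protocol has more. */
typedef enum
{
    MSG_RELATION,   /* schema metadata, not a data change */
    MSG_INSERT,
    MSG_UPDATE,
    MSG_DELETE,
    MSG_TRUNCATE
} MsgKind;

typedef struct
{
    MsgKind kind;
} StreamMsg;

bool
is_data_modification(MsgKind kind)
{
    return kind == MSG_INSERT || kind == MSG_UPDATE ||
           kind == MSG_DELETE || kind == MSG_TRUNCATE;
}

/*
 * Model of what happens when stream-commit arrives for remote_xid.
 * The messages were already spooled regardless of skip_xid; the skip
 * decision is made only here. Returns the number of data modification
 * messages that were applied. A skip_xid of 0 means "nothing to skip".
 */
size_t
replay_stream_file(const StreamMsg *msgs, size_t nmsgs,
                   uint32_t remote_xid, uint32_t skip_xid)
{
    bool    skipping = (skip_xid != 0 && skip_xid == remote_xid);
    size_t  applied = 0;

    assert(msgs != NULL || nmsgs == 0);

    for (size_t i = 0; i < nmsgs; i++)
    {
        if (!is_data_modification(msgs[i].kind))
            continue;   /* other messages are processed either way (elided) */
        if (skipping)
            continue;   /* drop the change instead of applying it */
        applied++;      /* stand-in for actually applying the change */
    }
    return applied;
}
```

Because the decision is made once at stream-commit, interleaved streams from other transactions and changes to skip_xid during spooling do not affect it.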

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

15 December 2021, 05:49:11

On Tue, Dec 14, 2021 at 2:35 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Tue, Dec 14, 2021 at 3:23 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > While the worker is skipping one of the skip transactions specified by
> > the user and immediately if the user specifies another skip
> > transaction while the skipping of the transaction is in progress this
> > new value will be reset by the worker while clearing the skip xid. I
> > felt once the worker has identified the skip xid and is about to skip
> > the xid, the worker can acquire a lock to prevent concurrency issues:
>
> That's a good point.
> If only the last_error_xid could be skipped, then this wouldn't be an
> issue, right?
> If a different xid to skip is specified while the worker is currently
> skipping a transaction, should that even be allowed?
>

We don't expect such usage but yes, it could happen, and that doesn't
seem good. I thought we could acquire a Share lock on pg_subscription
during the skip, but I'm not sure it's a good idea. It would be better
if we could find a way to allow users to specify only an XID that has
failed.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Dilip Kumar

Date:

15 December 2021, 06:57:19

On Tue, Dec 14, 2021 at 4:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>

> > Actually, I was trying to understand the use case for skipping while
> > streaming.  Actually, during streaming we are not doing any database
> > operation that means this will not generate any error.
> >
>
> Say, there is an error the first time when we start to apply changes
> for such a transaction. So, such a transaction will be streamed again.
> Say, the user has set the skip_xid before we stream a second time, so
> this time, we can skip it either during the stream phase or apply
> phase. I think the patch is skipping it during apply phase.
> Sawada-San, please confirm if my understanding is correct?
>

Got it, thanks for clarifying.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

15 December 2021, 07:10:02

On Wed, Dec 15, 2021 at 8:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Dec 14, 2021 at 2:35 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> >
> > On Tue, Dec 14, 2021 at 3:23 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > While the worker is skipping one of the skip transactions specified by
> > > the user and immediately if the user specifies another skip
> > > transaction while the skipping of the transaction is in progress this
> > > new value will be reset by the worker while clearing the skip xid. I
> > > felt once the worker has identified the skip xid and is about to skip
> > > the xid, the worker can acquire a lock to prevent concurrency issues:
> >
> > That's a good point.
> > If only the last_error_xid could be skipped, then this wouldn't be an
> > issue, right?
> > If a different xid to skip is specified while the worker is currently
> > skipping a transaction, should that even be allowed?
> >
>
> We don't expect such usage but yes, it could happen and seems not
> good. I thought we can acquire Share lock on pg_subscription during
> the skip but not sure it's a good idea. It would be better if we can
> find a way to allow users to specify only XID that has failed.
>

Yeah, but as we don't have a definite way to allow specifying only a
failed XID, I think it is better to use a share lock on that particular
subscription. We are already using it for adding/updating rel state (see
AddSubscriptionRelState, UpdateSubscriptionRelState), so this will be
another place to use a similar technique.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

15 December 2021, 07:16:20

On Tue, Dec 14, 2021 at 11:40 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Dec 3, 2021 at 12:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Skipping a whole transaction by specifying xid would be a good start.
> > Ideally, we'd like to automatically skip only operations within the
> > transaction that fail but it seems not easy to achieve. If we allow
> > specifying operations and/or relations, probably multiple operations
> > or relations need to be specified in some cases. Otherwise, the
> > subscriber cannot continue logical replication if the transaction has
> > multiple operations on different relations that fail. But similar to
> > the idea of specifying multiple xids, we need to note the fact that
> > user wouldn't know of the second operation failure unless the apply
> > worker applies the change. So I'm not sure there are many use cases in
> > practice where users can specify multiple operations and relations in
> > order to skip applies that fail.
>
> I think there would be use cases for specifying the relations or
> operation, e.g. if the user finds an issue in inserting in a
> particular relation then maybe based on some manual investigation he
> founds that the table has some constraint due to that it is failing on
> the subscriber side but on the publisher side that constraint is not
> there so maybe the user is okay to skip the changes for this table and
> not for other tables, or there might be a few more tables which are
> designed based on the same principle and can have similar error so
> isn't it good to provide an option to give the list of all such
> tables.
>

That's right, and I agree there could be some use case for it, and even
for specifying the operation, but I think we can always extend the
existing feature if the need arises. Note that the user can specify
only a single relation or operation anyway, because only one error can
be known at a time, and until that is resolved, the apply process won't
proceed. We have discussed providing these additional options in this
thread but thought of doing it later, once we have the base feature,
based on feedback from users.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Dilip Kumar

Date:

15 December 2021, 07:45:08

On Wed, Dec 15, 2021 at 9:46 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Dec 14, 2021 at 11:40 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> That's right and I agree there could be some use case for it and even
> specifying the operation but I think we can always extend the existing
> feature for it if the need arises. Note that the user can anyway only
> specify a single relation or an operation because there is a way to
> know only one error and till that is resolved, the apply process won't
> proceed. We have discussed providing these additional options in this
> thread but thought of doing it later once we have the base feature and
> based on the feedback from users.

Yeah, I only wanted to make the point that this could be useful; it
seems we are on the same page.  I agree we can extend it in the future
as well.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: Skipping logical replication transactions on subscriber side

From

Greg Nancarrow

Date:

15 December 2021, 07:58:29

On Wed, Dec 15, 2021 at 1:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> We don't expect such usage but yes, it could happen and seems not
> good. I thought we can acquire Share lock on pg_subscription during
> the skip but not sure it's a good idea. It would be better if we can
> find a way to allow users to specify only XID that has failed.
>

Yes, I agree that would be better.
If you didn't do that, I think you'd need to queue the XIDs to be
skipped (rather than locking).

Regards,
Greg Nancarrow
Fujitsu Australia

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

15 December 2021, 17:49:14

On Wed, Dec 15, 2021 at 1:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Dec 15, 2021 at 8:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Dec 14, 2021 at 2:35 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> > >
> > > On Tue, Dec 14, 2021 at 3:23 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > While the worker is skipping one of the skip transactions specified by
> > > > the user and immediately if the user specifies another skip
> > > > transaction while the skipping of the transaction is in progress this
> > > > new value will be reset by the worker while clearing the skip xid. I
> > > > felt once the worker has identified the skip xid and is about to skip
> > > > the xid, the worker can acquire a lock to prevent concurrency issues:
> > >
> > > That's a good point.
> > > If only the last_error_xid could be skipped, then this wouldn't be an
> > > issue, right?
> > > If a different xid to skip is specified while the worker is currently
> > > skipping a transaction, should that even be allowed?
> > >
> >
> > We don't expect such usage but yes, it could happen and seems not
> > good. I thought we can acquire Share lock on pg_subscription during
> > the skip but not sure it's a good idea. It would be better if we can
> > find a way to allow users to specify only XID that has failed.
> >
>
> Yeah, but as we don't have a definite way to allow specifying only
> failed XID, I think it is better to use share lock on that particular
> subscription. We are already using it for add/update rel state (see,
> AddSubscriptionRelState, UpdateSubscriptionRelState), so this will be
> another place to use a similar technique.

Yes, but it seems to mean that we disallow users from changing skip_xid
while the apply worker is skipping changes, so we will end up having
the same problem we have discussed so far:

In the current patch, we don't clear skip_xid at prepare time but do
that at commit-prepared time. But we cannot keep holding the lock until
commit-prepared comes because we don’t know when it will come. It’s
possible that another conflict occurs before the commit-prepared
comes. We also cannot clear skip_xid only at prepare time because that
doesn’t solve the concurrency problem at commit-prepared time. So if
my understanding is correct, we need to both clear skip_xid and
release the lock at prepare time, and commit the prepared (empty)
transaction at commit-prepared time (I assume that we prepare even
empty transactions).

Suppose that at prepare time we clear skip_xid (and release the lock)
and then prepare the transaction. If the server crashes right after
clearing skip_xid, skip_xid is already cleared but the transaction
will be sent again, so the user has to specify skip_xid again. So let’s
change the order: we prepare the transaction and then clear skip_xid.
But if the server crashes between the two steps, the transaction won’t
be sent again, but skip_xid is left behind, and the user has to clear
it. The leftover skip_xid can be cleared automatically at
commit-prepared time if the XID in the commit-prepared message matches
skip_xid, but this doesn’t actually solve the concurrency problem: if
the user changed skip_xid before commit-prepared, we would end up
clearing the value. So we might want to hold the lock until we clear
skip_xid, but we want to avoid that, as I explained at the beginning.
It seems like we have entered a loop.

Among these ideas, clearing skip_xid and then preparing the transaction
sounds better. Or we might want to revisit the idea of storing skip_xid
in shmem (e.g., ReplicationState) instead of in the catalog.
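
The two orderings discussed above can be summarized in a tiny standalone model of the state left behind when the server crashes between the two steps. StepOrder, CrashOutcome, and both functions are hypothetical names for illustration; the model encodes only the outcomes stated in this message and makes no claim about the actual apply worker code.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum
{
    CLEAR_THEN_PREPARE,     /* clear skip_xid first, then prepare */
    PREPARE_THEN_CLEAR      /* prepare first, then clear skip_xid */
} StepOrder;

typedef struct
{
    bool skip_xid_set;  /* is skip_xid still present after the crash? */
    bool txn_resent;    /* will the transaction be sent again? */
} CrashOutcome;

/* State after a crash that hits between the first and the second step. */
CrashOutcome
crash_between_steps(StepOrder order)
{
    /* Before either step: skip_xid is set and nothing has been prepared. */
    CrashOutcome out = { .skip_xid_set = true, .txn_resent = true };

    assert(order == CLEAR_THEN_PREPARE || order == PREPARE_THEN_CLEAR);

    if (order == CLEAR_THEN_PREPARE)
        out.skip_xid_set = false;   /* cleared, but the prepare never happened */
    else
        out.txn_resent = false;     /* prepared, but skip_xid was never cleared */
    return out;
}

/*
 * The state is consistent only if skip_xid is set exactly when the
 * transaction will arrive again; otherwise the user has to step in.
 */
bool
needs_user_action(CrashOutcome out)
{
    return out.skip_xid_set != out.txn_resent;
}
```

In this model both orderings need user action after such a crash. The difference is what the user must do: set skip_xid again in the first case, or clear a stale skip_xid in the second.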

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

16 December 2021, 05:42:57

On Wed, Dec 15, 2021 at 8:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Dec 15, 2021 at 1:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Dec 15, 2021 at 8:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Tue, Dec 14, 2021 at 2:35 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> > > >
> > > > On Tue, Dec 14, 2021 at 3:23 PM vignesh C <vignesh21@gmail.com> wrote:
> > > > >
> > > > > While the worker is skipping one of the skip transactions specified by
> > > > > the user and immediately if the user specifies another skip
> > > > > transaction while the skipping of the transaction is in progress this
> > > > > new value will be reset by the worker while clearing the skip xid. I
> > > > > felt once the worker has identified the skip xid and is about to skip
> > > > > the xid, the worker can acquire a lock to prevent concurrency issues:
> > > >
> > > > That's a good point.
> > > > If only the last_error_xid could be skipped, then this wouldn't be an
> > > > issue, right?
> > > > If a different xid to skip is specified while the worker is currently
> > > > skipping a transaction, should that even be allowed?
> > > >
> > >
> > > We don't expect such usage but yes, it could happen and seems not
> > > good. I thought we can acquire Share lock on pg_subscription during
> > > the skip but not sure it's a good idea. It would be better if we can
> > > find a way to allow users to specify only XID that has failed.
> > >
> >
> > Yeah, but as we don't have a definite way to allow specifying only
> > failed XID, I think it is better to use share lock on that particular
> > subscription. We are already using it for add/update rel state (see,
> > AddSubscriptionRelState, UpdateSubscriptionRelState), so this will be
> > another place to use a similar technique.
>
> Yes, but it seems to mean that we disallow users to change skip_xid
> while the apply worker is skipping changes so we will end up having
> the same problem we discussed so far;
>

I thought we only want to take the lock before clearing the skip_xid,
something like: take the lock, check whether the skip_xid in the
catalog is the same as the one we have skipped; if it is the same,
clear it, otherwise leave it as it is. How will that prevent users
from changing skip_xid while we are skipping changes?
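
The check-then-clear step described above can be sketched as follows. This is a minimal, self-contained model and not PostgreSQL code: `SubscriptionRow`, `clear_skip_xid_if_unchanged`, and the pthread mutex standing in for the lock on the subscription are all hypothetical names chosen for illustration.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

#define INVALID_XID 0u          /* stand-in for "no skip_xid set" */

/* Hypothetical model of the subscription's catalog row. */
typedef struct SubscriptionRow
{
    pthread_mutex_t lock;       /* stands in for the lock on the subscription */
    uint32_t        skip_xid;   /* XID the user asked to skip, or INVALID_XID */
} SubscriptionRow;

/*
 * Called by the worker after it has finished skipping skipped_xid.
 * The lock is held only for the compare-and-clear step, not while
 * changes are being skipped. skip_xid is cleared only if it still
 * holds the value that was skipped; a value set concurrently by the
 * user is left untouched. Returns true if the field was cleared.
 */
bool
clear_skip_xid_if_unchanged(SubscriptionRow *sub, uint32_t skipped_xid)
{
    bool cleared = false;

    pthread_mutex_lock(&sub->lock);
    if (sub->skip_xid == skipped_xid)
    {
        sub->skip_xid = INVALID_XID;
        cleared = true;
    }
    pthread_mutex_unlock(&sub->lock);

    return cleared;
}
```

Because the lock covers only the final comparison, a user can still change skip_xid at any time during the skip; the new value simply survives the worker's cleanup.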

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

16 December 2021, 08:06:34

On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Dec 15, 2021 at 8:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Dec 15, 2021 at 1:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Dec 15, 2021 at 8:19 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Tue, Dec 14, 2021 at 2:35 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
> > > > >
> > > > > On Tue, Dec 14, 2021 at 3:23 PM vignesh C <vignesh21@gmail.com> wrote:
> > > > > >
> > > > > > While the worker is skipping one of the skip transactions specified by
> > > > > > the user and immediately if the user specifies another skip
> > > > > > transaction while the skipping of the transaction is in progress this
> > > > > > new value will be reset by the worker while clearing the skip xid. I
> > > > > > felt once the worker has identified the skip xid and is about to skip
> > > > > > the xid, the worker can acquire a lock to prevent concurrency issues:
> > > > >
> > > > > That's a good point.
> > > > > If only the last_error_xid could be skipped, then this wouldn't be an
> > > > > issue, right?
> > > > > If a different xid to skip is specified while the worker is currently
> > > > > skipping a transaction, should that even be allowed?
> > > > >
> > > >
> > > > We don't expect such usage but yes, it could happen and seems not
> > > > good. I thought we can acquire Share lock on pg_subscription during
> > > > the skip but not sure it's a good idea. It would be better if we can
> > > > find a way to allow users to specify only XID that has failed.
> > > >
> > >
> > > Yeah, but as we don't have a definite way to allow specifying only
> > > failed XID, I think it is better to use share lock on that particular
> > > subscription. We are already using it for add/update rel state (see,
> > > AddSubscriptionRelState, UpdateSubscriptionRelState), so this will be
> > > another place to use a similar technique.
> >
> > Yes, but it seems to mean that we disallow users to change skip_xid
> > while the apply worker is skipping changes so we will end up having
> > the same problem we discussed so far;
> >
>
> I thought we just want to lock before clearing the skip_xid something
> like take the lock, check if the skip_xid in the catalog is the same
> as we have skipped, if it is the same then clear it, otherwise, leave
> it as it is. How will that disallow users to change skip_xid when we
> are skipping changes?

Oh, I thought we wanted to keep holding the lock while skipping changes
(changing skip_xid requires acquiring the lock).

So if skip_xid has already been changed, the apply worker would do
replorigin_advance() with WAL logging, instead of committing the
catalog change?

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

16 December 2021, 08:21:32

On Thu, Dec 16, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > I thought we just want to lock before clearing the skip_xid something
> > like take the lock, check if the skip_xid in the catalog is the same
> > as we have skipped, if it is the same then clear it, otherwise, leave
> > it as it is. How will that disallow users to change skip_xid when we
> > are skipping changes?
>
> Oh I thought we wanted to keep holding the lock while skipping changes
> (changing skip_xid requires acquiring the lock).
>
> So if skip_xid is already changed, the apply worker would do
> replorigin_advance() with WAL logging, instead of committing the
> catalog change?
>

Right. BTW, how are you planning to advance the origin? Normally, a
commit transaction would do it but when we are skipping all changes,
the commit might not do it as there won't be any transaction id
assigned.
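
The two outcomes discussed here can be modelled in a few lines. This is only a sketch of the reasoning in this thread, under the assumption stated above that a commit records origin progress only when the transaction wrote something (and so got an XID). `SkipOutcome` and `finish_skipped_transaction` are hypothetical names, and the catalog is reduced to a single integer.

```c
#include <stdbool.h>
#include <stdint.h>

#define INVALID_XID 0u          /* stand-in for "no skip_xid set" */

typedef struct SkipOutcome
{
    bool cleared_catalog;        /* skip_xid was reset in the modelled catalog */
    bool needs_explicit_advance; /* origin must be advanced outside of commit */
} SkipOutcome;

/*
 * Decide what happens at the end of a skipped transaction.
 *
 * If the catalog still holds the XID we skipped, we clear it. That is a
 * catalog write, so (in this model) the transaction has an XID and its
 * commit can carry the origin progress.
 *
 * If the user changed skip_xid in the meantime, nothing is written, no
 * XID is assigned, and the origin has to be advanced explicitly.
 */
SkipOutcome
finish_skipped_transaction(uint32_t *catalog_skip_xid, uint32_t skipped_xid)
{
    SkipOutcome out = { false, false };

    if (*catalog_skip_xid == skipped_xid)
    {
        *catalog_skip_xid = INVALID_XID;
        out.cleared_catalog = true;
    }
    else
    {
        out.needs_explicit_advance = true;
    }

    return out;
}
```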

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

16 December 2021, 08:42:20

On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Dec 16, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > I thought we just want to lock before clearing the skip_xid something
> > > like take the lock, check if the skip_xid in the catalog is the same
> > > as we have skipped, if it is the same then clear it, otherwise, leave
> > > it as it is. How will that disallow users to change skip_xid when we
> > > are skipping changes?
> >
> > Oh I thought we wanted to keep holding the lock while skipping changes
> > (changing skip_xid requires acquiring the lock).
> >
> > So if skip_xid is already changed, the apply worker would do
> > replorigin_advance() with WAL logging, instead of committing the
> > catalog change?
> >
>
> Right. BTW, how are you planning to advance the origin? Normally, a
> commit transaction would do it but when we are skipping all changes,
> the commit might not do it as there won't be any transaction id
> assigned.

I've not tested it yet but replorigin_advance() with wal_log = true
seems to work for this case.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Peter Eisentraut

Date:

17 December 2021, 12:53:48

On 13.12.21 04:12, Greg Nancarrow wrote:
> (ii) "Setting -1 means to reset the transaction ID"
> 
> Shouldn't it be explained what resetting actually does and when it can
> be, or is needed to be, done? Isn't it automatically reset?
> I notice that negative values (other than -1) seem to be regarded as
> valid - is that right?
> Also, what happens if this option is set multiple times? Does it just
> override and use the latest setting? (other option handling errors out
> with errorConflictingDefElem()).
> e.g. alter subscription sub skip (xid = 721, xid = 722);

Let's not use magic numbers and instead use a syntax that is more 
explicit, like SKIP (xid = NONE) or RESET SKIP or something like that.
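
To illustrate the difference between a magic number and an explicit keyword, here is a small sketch of how such an option value could be validated. It is not the patch's parser: `parse_skip_xid` and the `SkipXidParse` enum are hypothetical, and the only PostgreSQL fact relied on is that normal transaction IDs start at 3 (FirstNormalTransactionId).

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <strings.h>            /* strcasecmp (POSIX) */

#define FIRST_NORMAL_XID 3u     /* mirrors PostgreSQL's FirstNormalTransactionId */

typedef enum SkipXidParse
{
    SKIP_XID_SET,               /* a valid XID was given and stored in *xid */
    SKIP_XID_NONE,              /* the keyword NONE: reset the setting */
    SKIP_XID_INVALID            /* anything else, including -1 */
} SkipXidParse;

/* Hypothetical validator for the value in SKIP (xid = <value>). */
SkipXidParse
parse_skip_xid(const char *value, uint32_t *xid)
{
    char          *end;
    unsigned long  parsed;

    if (strcasecmp(value, "none") == 0)
        return SKIP_XID_NONE;

    /* strtoul() accepts a leading sign or spaces, so require a digit first. */
    if (value[0] < '0' || value[0] > '9')
        return SKIP_XID_INVALID;

    errno = 0;
    parsed = strtoul(value, &end, 10);
    if (errno != 0 || *end != '\0' || parsed > UINT32_MAX)
        return SKIP_XID_INVALID;

    /* Reserved XIDs (0..2) can never be a remote transaction to skip. */
    if (parsed < FIRST_NORMAL_XID)
        return SKIP_XID_INVALID;

    *xid = (uint32_t) parsed;
    return SKIP_XID_SET;
}
```

With this shape, "-1" and other negative values are rejected outright, and resetting is spelled out by the caller handling `SKIP_XID_NONE`.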

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

17 December 2021, 13:12:10

On Fri, Dec 17, 2021 at 3:23 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 13.12.21 04:12, Greg Nancarrow wrote:
> > (ii) "Setting -1 means to reset the transaction ID"
> >
> > Shouldn't it be explained what resetting actually does and when it can
> > be, or is needed to be, done? Isn't it automatically reset?
> > I notice that negative values (other than -1) seem to be regarded as
> > valid - is that right?
> > Also, what happens if this option is set multiple times? Does it just
> > override and use the latest setting? (other option handling errors out
> > with errorConflictingDefElem()).
> > e.g. alter subscription sub skip (xid = 721, xid = 722);
>
> Let's not use magic numbers and instead use a syntax that is more
> explicit, like SKIP (xid = NONE) or RESET SKIP or something like that.
>

+1 for using SKIP (xid = NONE) because otherwise we would first need
to introduce RESET syntax for this command.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

17 December 2021, 14:13:11

On Fri, Dec 17, 2021 at 7:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Dec 17, 2021 at 3:23 PM Peter Eisentraut
> <peter.eisentraut@enterprisedb.com> wrote:
> >
> > On 13.12.21 04:12, Greg Nancarrow wrote:
> > > (ii) "Setting -1 means to reset the transaction ID"
> > >
> > > Shouldn't it be explained what resetting actually does and when it can
> > > be, or is needed to be, done? Isn't it automatically reset?
> > > I notice that negative values (other than -1) seem to be regarded as
> > > valid - is that right?
> > > Also, what happens if this option is set multiple times? Does it just
> > > override and use the latest setting? (other option handling errors out
> > > with errorConflictingDefElem()).
> > > e.g. alter subscription sub skip (xid = 721, xid = 722);
> >
> > Let's not use magic numbers and instead use a syntax that is more
> > explicit, like SKIP (xid = NONE) or RESET SKIP or something like that.
> >
>
> +1 for using SKIP (xid = NONE) because otherwise first we need to
> introduce RESET syntax for this command.

Agreed. Thank you for the comment!

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

27 December 2021, 07:23:36

On Thu, Dec 16, 2021 at 2:42 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Dec 16, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > I thought we just want to lock before clearing the skip_xid something
> > > > like take the lock, check if the skip_xid in the catalog is the same
> > > > as we have skipped, if it is the same then clear it, otherwise, leave
> > > > it as it is. How will that disallow users to change skip_xid when we
> > > > are skipping changes?
> > >
> > > Oh I thought we wanted to keep holding the lock while skipping changes
> > > (changing skip_xid requires acquiring the lock).
> > >
> > > So if skip_xid is already changed, the apply worker would do
> > > replorigin_advance() with WAL logging, instead of committing the
> > > catalog change?
> > >
> >
> > Right. BTW, how are you planning to advance the origin? Normally, a
> > commit transaction would do it but when we are skipping all changes,
> > the commit might not do it as there won't be any transaction id
> > assigned.
>
> I've not tested it yet but replorigin_advance() with wal_log = true
> seems to work for this case.

I've tested it and realized that we cannot use replorigin_advance()
for this purpose without changes. That is, the current
replorigin_advance() doesn't allow the owner to advance its own origin:

        /* Make sure it's not used by somebody else */
        if (replication_state->acquired_by != 0)
        {
            ereport(ERROR,
                    (errcode(ERRCODE_OBJECT_IN_USE),
                     errmsg("replication origin with OID %d is already active for PID %d",
                            replication_state->roident,
                            replication_state->acquired_by)));
        }

So we need to change it so that the origin owner can advance its
origin, which makes sense to me.
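
A reduced model of the relaxed check might look like the following. It is a sketch under simplifying assumptions, not the actual change to replorigin_advance(): `OriginState` and `origin_advance` are hypothetical, locking and WAL logging are left out, and a boolean return replaces ereport(ERROR).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced model of a replication origin's shared state. */
typedef struct OriginState
{
    int      acquired_by;       /* PID of the owning process, 0 if unowned */
    uint64_t remote_lsn;        /* last remote position recorded for the origin */
} OriginState;

/*
 * Advance the origin unless it is owned by a different process.
 * The extra "!= caller_pid" test is the relaxation discussed above:
 * the owner itself is allowed through. Returns false where the
 * original check would report "already active for PID".
 */
bool
origin_advance(OriginState *state, int caller_pid, uint64_t new_lsn)
{
    if (state->acquired_by != 0 && state->acquired_by != caller_pid)
        return false;

    state->remote_lsn = new_lsn;
    return true;
}
```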

Also, when we have to update the origin directly instead of updating
it as part of committing the catalog change, we cannot record the
origin timestamp. This behavior makes sense to me because we skipped
the transaction. But ISTM it’s not good if we omit the origin timestamp
only when directly updating the origin. So probably we need to always
omit the origin timestamp.

Apart from that, I'm vaguely concerned that the logic seems to be
getting complex. That probably comes from the fact that we store
skip_xid in the catalog and update the catalog to clear/set the
skip_xid. It might be worth revisiting the idea of storing skip_xid in
shmem (e.g., ReplicationState)? That way, we can always advance the
origin by replorigin_advance() and don’t need to worry about complex
cases such as the server crashing while preparing the transaction. I’ve
not yet considered the downsides enough, though.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

05 January 2022, 06:30:57

On Mon, Dec 27, 2021 at 9:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Dec 16, 2021 at 2:42 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Dec 16, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > I thought we just want to lock before clearing the skip_xid something
> > > > > like take the lock, check if the skip_xid in the catalog is the same
> > > > > as we have skipped, if it is the same then clear it, otherwise, leave
> > > > > it as it is. How will that disallow users to change skip_xid when we
> > > > > are skipping changes?
> > > >
> > > > Oh I thought we wanted to keep holding the lock while skipping changes
> > > > (changing skip_xid requires acquiring the lock).
> > > >
> > > > So if skip_xid is already changed, the apply worker would do
> > > > replorigin_advance() with WAL logging, instead of committing the
> > > > catalog change?
> > > >
> > >
> > > Right. BTW, how are you planning to advance the origin? Normally, a
> > > commit transaction would do it but when we are skipping all changes,
> > > the commit might not do it as there won't be any transaction id
> > > assigned.
> >
> > I've not tested it yet but replorigin_advance() with wal_log = true
> > seems to work for this case.
>
> I've tested it and realized that we cannot use replorigin_advance()
> for this purpose without changes. That is, the current
> replorigin_advance() doesn't allow to advance the origin by the owner:
>
>         /* Make sure it's not used by somebody else */
>         if (replication_state->acquired_by != 0)
>         {
>             ereport(ERROR,
>                     (errcode(ERRCODE_OBJECT_IN_USE),
>                      errmsg("replication origin with OID %d is already active for PID %d",
>                             replication_state->roident,
>                             replication_state->acquired_by)));
>         }
>
> So we need to change it so that the origin owner can advance its
> origin, which makes sense to me.
>
> Also, when we have to update the origin instead of committing the
> catalog change while updating the origin, we cannot record the origin
> timestamp.
>

Is it because we currently update the origin timestamp with the commit record?

> This behavior makes sense to me because we skipped the
> transaction. But ISTM it’s not good if we emit the origin timestamp
> only when directly updating the origin. So probably we need to always
> omit origin timestamp.
>

Do you mean to say that you want to omit it even when we are
committing the changes?

> Apart from that, I'm vaguely concerned that the logic seems to be
> getting complex. Probably it comes from the fact that we store
> skip_xid in the catalog and update the catalog to clear/set the
> skip_xid. It might be worth revisiting the idea of storing skip_xid on
> shmem (e.g., ReplicationState)?
>

IIRC, the problem with that idea was that we won't remember skip_xid
information after server restart and the user won't even know that it
has to set it again.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Dilip Kumar

Date:

05 January 2022, 07:18:27

On Wed, Jan 5, 2022 at 9:01 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Dec 27, 2021 at 9:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Do you mean to say that you want to omit it even when we are
> committing the changes?
>
> > Apart from that, I'm vaguely concerned that the logic seems to be
> > getting complex. Probably it comes from the fact that we store
> > skip_xid in the catalog and update the catalog to clear/set the
> > skip_xid. It might be worth revisiting the idea of storing skip_xid on
> > shmem (e.g., ReplicationState)?
> >
>
> IIRC, the problem with that idea was that we won't remember skip_xid
> information after server restart and the user won't even know that it
> has to set it again.

I agree that if we don't keep it in the catalog and the transaction is
replayed after a restart, the user has to set the skip xid again. That
would be pretty inconvenient, because the user might have to analyze
the failure again and repeat the same process as before the restart.
But OTOH, the combination of a restart and a skip xid might not be very
frequent, so this might not be a very bad option. Basically, I am in
favor of storing it in the catalog, as that solution looks cleaner, at
least from the user's pov. But if we think there are a lot of
complexities from the implementation pov, then we might analyze the
approach of storing it in shmem as well.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

06 January 2022, 07:57:27

On Wed, Jan 5, 2022 at 9:48 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Jan 5, 2022 at 9:01 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Dec 27, 2021 at 9:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Do you mean to say that you want to omit it even when we are
> > committing the changes?
> >
> > > Apart from that, I'm vaguely concerned that the logic seems to be
> > > getting complex. Probably it comes from the fact that we store
> > > skip_xid in the catalog and update the catalog to clear/set the
> > > skip_xid. It might be worth revisiting the idea of storing skip_xid on
> > > shmem (e.g., ReplicationState)?
> > >
> >
> > IIRC, the problem with that idea was that we won't remember skip_xid
> > information after server restart and the user won't even know that it
> > has to set it again.
>
>
> I agree, that if we don't keep it in the catalog then after restart if
> the transaction replayed again then the user has to set the skip xid
> again and that would be pretty inconvenient because the user might
> have to analyze the failure again and repeat the same process he did
> before restart.  But OTOH the combination of restart and the skip xid
> might not be very frequent so this might not be a very bad option.
> Basically, I am in favor of storing it in a catalog as that solution
> looks cleaner at least from the user pov but if we think there are a
> lot of complexities from the implementation pov then we might analyze
> the approach of storing in shmem as well.
>

Fair point, but I think it is better to first see the patch, or the
problems that can't be solved, if we pursue storing it in the catalog.
Even if we decide to store it in shmem, we need to invent some way to
inform the user that we have not honored the previous setting of
skip_xid and that it needs to be set again.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

07 January 2022, 04:04:58

On Wed, Jan 5, 2022 at 12:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Dec 27, 2021 at 9:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Dec 16, 2021 at 2:42 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Thu, Dec 16, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > >
> > > > > > I thought we just want to lock before clearing the skip_xid something
> > > > > > like take the lock, check if the skip_xid in the catalog is the same
> > > > > > as we have skipped, if it is the same then clear it, otherwise, leave
> > > > > > it as it is. How will that disallow users to change skip_xid when we
> > > > > > are skipping changes?
> > > > >
> > > > > Oh I thought we wanted to keep holding the lock while skipping changes
> > > > > (changing skip_xid requires acquiring the lock).
> > > > >
> > > > > So if skip_xid is already changed, the apply worker would do
> > > > > replorigin_advance() with WAL logging, instead of committing the
> > > > > catalog change?
> > > > >
> > > >
> > > > Right. BTW, how are you planning to advance the origin? Normally, a
> > > > commit transaction would do it but when we are skipping all changes,
> > > > the commit might not do it as there won't be any transaction id
> > > > assigned.
> > >
> > > I've not tested it yet but replorigin_advance() with wal_log = true
> > > seems to work for this case.
> >
> > I've tested it and realized that we cannot use replorigin_advance()
> > for this purpose without changes. That is, the current
> > replorigin_advance() doesn't allow to advance the origin by the owner:
> >
> >         /* Make sure it's not used by somebody else */
> >         if (replication_state->acquired_by != 0)
> >         {
> >             ereport(ERROR,
> >                     (errcode(ERRCODE_OBJECT_IN_USE),
> >                      errmsg("replication origin with OID %d is already active for PID %d",
> >                             replication_state->roident,
> >                             replication_state->acquired_by)));
> >         }
> >
> > So we need to change it so that the origin owner can advance its
> > origin, which makes sense to me.
> >
> > Also, when we have to update the origin instead of committing the
> > catalog change while updating the origin, we cannot record the origin
> > timestamp.
> >
>
> Is it because we currently update the origin timestamp with commit record?

Yes.

>
> > This behavior makes sense to me because we skipped the
> > transaction. But ISTM it’s not good if we emit the origin timestamp
> > only when directly updating the origin. So probably we need to always
> > omit origin timestamp.
> >
>
> Do you mean to say that you want to omit it even when we are
> committing the changes?

Yes, for consistency it would be better to record only the origin LSN.

>
> > Apart from that, I'm vaguely concerned that the logic seems to be
> > getting complex. Probably it comes from the fact that we store
> > skip_xid in the catalog and update the catalog to clear/set the
> > skip_xid. It might be worth revisiting the idea of storing skip_xid on
> > shmem (e.g., ReplicationState)?
> >
>
> IIRC, the problem with that idea was that we won't remember skip_xid
> information after server restart and the user won't even know that it
> has to set it again.

Right, I agree that it’s not convenient when the server restarts or
crashes, but these problems may not be critical in the situations
where users have to use this feature: the subscriber has already
entered an error loop, so they can find the XID again, and it’s
uncommon that they need to restart while skipping changes.

Anyway, I'll submit an updated patch soon so we can discuss complexity
vs. convenience.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

07 January 2022, 07:22:56

On Fri, Jan 7, 2022 at 6:35 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jan 5, 2022 at 12:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Dec 27, 2021 at 9:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Dec 16, 2021 at 2:42 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Thu, Dec 16, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > > On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > >
> > > > > > > I thought we just want to lock before clearing the skip_xid something
> > > > > > > like take the lock, check if the skip_xid in the catalog is the same
> > > > > > > as we have skipped, if it is the same then clear it, otherwise, leave
> > > > > > > it as it is. How will that disallow users to change skip_xid when we
> > > > > > > are skipping changes?
> > > > > >
> > > > > > Oh I thought we wanted to keep holding the lock while skipping changes
> > > > > > (changing skip_xid requires acquiring the lock).
> > > > > >
> > > > > > So if skip_xid is already changed, the apply worker would do
> > > > > > replorigin_advance() with WAL logging, instead of committing the
> > > > > > catalog change?
> > > > > >
> > > > >
> > > > > Right. BTW, how are you planning to advance the origin? Normally, a
> > > > > commit transaction would do it but when we are skipping all changes,
> > > > > the commit might not do it as there won't be any transaction id
> > > > > assigned.
> > > >
> > > > I've not tested it yet but replorigin_advance() with wal_log = true
> > > > seems to work for this case.
> > >
> > > I've tested it and realized that we cannot use replorigin_advance()
> > > for this purpose without changes. That is, the current
> > > replorigin_advance() doesn't allow to advance the origin by the owner:
> > >
> > >         /* Make sure it's not used by somebody else */
> > >         if (replication_state->acquired_by != 0)
> > >         {
> > >             ereport(ERROR,
> > >                     (errcode(ERRCODE_OBJECT_IN_USE),
> > >                      errmsg("replication origin with OID %d is already active for PID %d",
> > >                             replication_state->roident,
> > >                             replication_state->acquired_by)));
> > >         }
> > >
> > > So we need to change it so that the origin owner can advance its
> > > origin, which makes sense to me.
> > >
> > > Also, when we have to update the origin instead of committing the
> > > catalog change while updating the origin, we cannot record the origin
> > > timestamp.
> > >
> >
> > Is it because we currently update the origin timestamp with commit record?
>
> Yes.
>
> >
> > > This behavior makes sense to me because we skipped the
> > > transaction. But ISTM it’s not good if we emit the origin timestamp
> > > only when directly updating the origin. So probably we need to always
> > > omit origin timestamp.
> > >
> >
> > Do you mean to say that you want to omit it even when we are
> > committing the changes?
>
> Yes, it would be better to record only origin lsn in terms of consistency.
>

I am not so sure about this point, because then what purpose would the
origin timestamp serve in the code?

> >
> > > Apart from that, I'm vaguely concerned that the logic seems to be
> > > getting complex. Probably it comes from the fact that we store
> > > skip_xid in the catalog and update the catalog to clear/set the
> > > skip_xid. It might be worth revisiting the idea of storing skip_xid on
> > > shmem (e.g., ReplicationState)?
> > >
> >
> > IIRC, the problem with that idea was that we won't remember skip_xid
> > information after server restart and the user won't even know that it
> > has to set it again.
>
> Right, I agree that it’s not convenient when the server restarts or
> crashes, but these problems could not be critical in the situation
> where users have to use this feature; the subscriber already entered
> an error loop so they can know xid again and it’s an uncommon case
> that they need to restart during skipping changes.
>
> Anyway, I'll submit an updated patch soon so we can discuss complexity
> vs. convenience.
>

Okay, that sounds reasonable.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

07 January 2022, 08:52:52

On Fri, Jan 7, 2022 at 10:04 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jan 5, 2022 at 12:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Dec 27, 2021 at 9:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Dec 16, 2021 at 2:42 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Thu, Dec 16, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > > On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > >
> > > > > > > I thought we just want to lock before clearing the skip_xid something
> > > > > > > like take the lock, check if the skip_xid in the catalog is the same
> > > > > > > as we have skipped, if it is the same then clear it, otherwise, leave
> > > > > > > it as it is. How will that disallow users to change skip_xid when we
> > > > > > > are skipping changes?
> > > > > >
> > > > > > Oh I thought we wanted to keep holding the lock while skipping changes
> > > > > > (changing skip_xid requires acquiring the lock).
> > > > > >
> > > > > > So if skip_xid is already changed, the apply worker would do
> > > > > > replorigin_advance() with WAL logging, instead of committing the
> > > > > > catalog change?
> > > > > >
> > > > >
> > > > > Right. BTW, how are you planning to advance the origin? Normally, a
> > > > > commit transaction would do it but when we are skipping all changes,
> > > > > the commit might not do it as there won't be any transaction id
> > > > > assigned.
> > > >
> > > > I've not tested it yet but replorigin_advance() with wal_log = true
> > > > seems to work for this case.
> > >
> > > I've tested it and realized that we cannot use replorigin_advance()
> > > for this purpose without changes. That is, the current
> > > replorigin_advance() doesn't allow to advance the origin by the owner:
> > >
> > >         /* Make sure it's not used by somebody else */
> > >         if (replication_state->acquired_by != 0)
> > >         {
> > >             ereport(ERROR,
> > >                     (errcode(ERRCODE_OBJECT_IN_USE),
> > >                      errmsg("replication origin with OID %d is already active for PID %d",
> > >                             replication_state->roident,
> > >                             replication_state->acquired_by)));
> > >         }
> > >
> > > So we need to change it so that the origin owner can advance its
> > > origin, which makes sense to me.
> > >
> > > Also, when we have to update the origin instead of committing the
> > > catalog change while updating the origin, we cannot record the origin
> > > timestamp.
> > >
> >
> > Is it because we currently update the origin timestamp with commit record?
>
> Yes.
>
> >
> > > This behavior makes sense to me because we skipped the
> > > transaction. But ISTM it’s not good if we emit the origin timestamp
> > > only when directly updating the origin. So probably we need to always
> > > omit origin timestamp.
> > >
> >
> > Do you mean to say that you want to omit it even when we are
> > committing the changes?
>
> Yes, it would be better to record only origin lsn in terms of consistency.
>
> >
> > > Apart from that, I'm vaguely concerned that the logic seems to be
> > > getting complex. Probably it comes from the fact that we store
> > > skip_xid in the catalog and update the catalog to clear/set the
> > > skip_xid. It might be worth revisiting the idea of storing skip_xid on
> > > shmem (e.g., ReplicationState)?
> > >
> >
> > IIRC, the problem with that idea was that we won't remember skip_xid
> > information after server restart and the user won't even know that it
> > has to set it again.
>
> Right, I agree that it’s not convenient when the server restarts or
> crashes, but these problems could not be critical in the situation
> where users have to use this feature; the subscriber already entered
> an error loop so they can know xid again and it’s an uncommon case
> that they need to restart during skipping changes.
>
> Anyway, I'll submit an updated patch soon so we can discuss complexity
> vs. convenience.

I've attached an updated patch. Please review it.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

v2-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patch

Re: Skipping logical replication transactions on subscriber side

From

vignesh C

Date:

January 10, 2022, 12:27:28

On Fri, Jan 7, 2022 at 11:23 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Jan 7, 2022 at 10:04 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Jan 5, 2022 at 12:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Dec 27, 2021 at 9:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Thu, Dec 16, 2021 at 2:42 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > >
> > > > > > On Thu, Dec 16, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > > >
> > > > > > > On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > > >
> > > > > > > > I thought we just want to lock before clearing the skip_xid something
> > > > > > > > like take the lock, check if the skip_xid in the catalog is the same
> > > > > > > > as we have skipped, if it is the same then clear it, otherwise, leave
> > > > > > > > it as it is. How will that disallow users to change skip_xid when we
> > > > > > > > are skipping changes?
> > > > > > >
> > > > > > > Oh I thought we wanted to keep holding the lock while skipping changes
> > > > > > > (changing skip_xid requires acquiring the lock).
> > > > > > >
> > > > > > > So if skip_xid is already changed, the apply worker would do
> > > > > > > replorigin_advance() with WAL logging, instead of committing the
> > > > > > > catalog change?
> > > > > > >
> > > > > >
> > > > > > Right. BTW, how are you planning to advance the origin? Normally, a
> > > > > > commit transaction would do it but when we are skipping all changes,
> > > > > > the commit might not do it as there won't be any transaction id
> > > > > > assigned.
> > > > >
> > > > > I've not tested it yet but replorigin_advance() with wal_log = true
> > > > > seems to work for this case.
> > > >
> > > > I've tested it and realized that we cannot use replorigin_advance()
> > > > for this purpose without changes. That is, the current
> > > > replorigin_advance() doesn't allow to advance the origin by the owner:
> > > >
> > > >         /* Make sure it's not used by somebody else */
> > > >         if (replication_state->acquired_by != 0)
> > > >         {
> > > >             ereport(ERROR,
> > > >                     (errcode(ERRCODE_OBJECT_IN_USE),
> > > >                      errmsg("replication origin with OID %d is already
> > > > active for PID %d",
> > > >                             replication_state->roident,
> > > >                             replication_state->acquired_by)));
> > > >         }
> > > >
> > > > So we need to change it so that the origin owner can advance its
> > > > origin, which makes sense to me.
> > > >
> > > > Also, when we have to update the origin instead of committing the
> > > > catalog change while updating the origin, we cannot record the origin
> > > > timestamp.
> > > >
> > >
> > > Is it because we currently update the origin timestamp with commit record?
> >
> > Yes.
> >
> > >
> > > > This behavior makes sense to me because we skipped the
> > > > transaction. But ISTM it’s not good if we emit the origin timestamp
> > > > only when directly updating the origin. So probably we need to always
> > > > omit origin timestamp.
> > > >
> > >
> > > Do you mean to say that you want to omit it even when we are
> > > committing the changes?
> >
> > Yes, it would be better to record only origin lsn in terms of consistency.
> >
> > >
> > > > Apart from that, I'm vaguely concerned that the logic seems to be
> > > > getting complex. Probably it comes from the fact that we store
> > > > skip_xid in the catalog and update the catalog to clear/set the
> > > > skip_xid. It might be worth revisiting the idea of storing skip_xid on
> > > > shmem (e.g., ReplicationState)?
> > > >
> > >
> > > IIRC, the problem with that idea was that we won't remember skip_xid
> > > information after server restart and the user won't even know that it
> > > has to set it again.
> >
> > Right, I agree that it’s not convenient when the server restarts or
> > crashes, but these problems could not be critical in the situation
> > where users have to use this feature; the subscriber already entered
> > an error loop so they can know xid again and it’s an uncommon case
> > that they need to restart during skipping changes.
> >
> > Anyway, I'll submit an updated patch soon so we can discuss complexity
> > vs. convenience.
>
> Attached an updated patch. Please review it.

Thanks for the updated patch. A few comments:
1) Should this comparison be case-insensitive so that NONE is accepted too?
+                       /* Setting xid = NONE is treated as resetting xid */
+                       if (strcmp(xid_str, "none") == 0)
+                               xid = InvalidTransactionId;

2) Can we have an option to specify the last_error_xid from
pg_stat_subscription_workers? Something like:
alter subscription sub1 skip ( XID = 'last_subscription_error');

When the user specifies last_subscription_error, it should pick
last_error_xid from pg_stat_subscription_workers.
Since this is a critical operation, an option that automatically picks
the value from pg_stat_subscription_workers and sets it would be
useful.

3) The following syntax is currently accepted; I feel it should throw
an error:
postgres=# alter subscription sub1 set ( XID = 100);
ALTER SUBSCRIPTION

4) You might need to rebase the patch:
git am v2-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patch
Applying: Add ALTER SUBSCRIPTION ... SKIP to skip the transaction on
subscriber nodes
error: patch failed: doc/src/sgml/logical-replication.sgml:333
error: doc/src/sgml/logical-replication.sgml: patch does not apply
Patch failed at 0001 Add ALTER SUBSCRIPTION ... SKIP to skip the
transaction on subscriber nodes
hint: Use 'git am --show-current-patch=diff' to see the failed patch

5) You might have to rename 027_skip_xact to 028_skip_xact, as
027_nosuperuser.pl already exists:
diff --git a/src/test/subscription/t/027_skip_xact.pl b/src/test/subscription/t/027_skip_xact.pl
new file mode 100644
index 0000000000..a63c9c345e
--- /dev/null
+++ b/src/test/subscription/t/027_skip_xact.pl
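
Regarding comment 1) above, a case-insensitive check could look roughly
like the following. This is only a standalone sketch, not the patch's
code: parse_skip_xid() and INVALID_XID are hypothetical names, and it
uses POSIX strcasecmp() where code inside PostgreSQL would normally use
pg_strcasecmp().

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <strings.h>            /* strcasecmp (POSIX) */

/* Stand-in for PostgreSQL's InvalidTransactionId, which is 0. */
#define INVALID_XID ((uint32_t) 0)

/*
 * Hypothetical helper: parse the value given for the XID option.
 * "none" in any letter case resets the XID; otherwise the string must be
 * a plain decimal number that fits in 32 bits.  Returns true on success
 * and stores the result in *xid.
 */
bool
parse_skip_xid(const char *xid_str, uint32_t *xid)
{
    char       *end;
    unsigned long long val;

    if (strcasecmp(xid_str, "none") == 0)
    {
        *xid = INVALID_XID;
        return true;
    }

    /* Reject signs, leading whitespace and empty input up front. */
    if (!isdigit((unsigned char) xid_str[0]))
        return false;

    errno = 0;
    val = strtoull(xid_str, &end, 10);
    if (errno != 0 || *end != '\0' || val > UINT32_MAX)
        return false;

    *xid = (uint32_t) val;
    return true;
}
```

With this shape, 'NONE', 'none' and 'None' all reset the XID, while a
malformed value such as '59x' is rejected instead of being silently
accepted.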

Regards,
Vignesh

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 10, 2022, 14:50:16

On Thu, Dec 16, 2021 at 11:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > >
> > > So if skip_xid is already changed, the apply worker would do
> > > replorigin_advance() with WAL logging, instead of committing the
> > > catalog change?
> > >
> >
> > Right. BTW, how are you planning to advance the origin? Normally, a
> > commit transaction would do it but when we are skipping all changes,
> > the commit might not do it as there won't be any transaction id
> > assigned.
>
> I've not tested it yet but replorigin_advance() with wal_log = true
> seems to work for this case.
>

IIUC, the corresponding changes in the latest patch are as follows:

--- a/src/backend/replication/logical/origin.c
+++ b/src/backend/replication/logical/origin.c
@@ -921,7 +921,8 @@ replorigin_advance(RepOriginId node,
  LWLockAcquire(&replication_state->lock, LW_EXCLUSIVE);

  /* Make sure it's not used by somebody else */
- if (replication_state->acquired_by != 0)
+ if (replication_state->acquired_by != 0 &&
+ replication_state->acquired_by != MyProcPid)
  {
...
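
To make the effect of the hunk above concrete, here is a simplified
model of the old and new ownership tests. It is a sketch only: the two
function names are made up for illustration, and plain integers stand in
for replication_state->acquired_by and MyProcPid.

```c
#include <stdbool.h>

/*
 * Old test: any acquired origin is rejected, even when the caller itself
 * is the process that acquired it.  Returns true if the advance must fail.
 */
bool
origin_busy_old(int acquired_by)
{
    return acquired_by != 0;
}

/*
 * New test: the origin is only considered busy when it was acquired by a
 * different process.  acquired_by is 0 when nobody holds the origin.
 */
bool
origin_busy_new(int acquired_by, int my_pid)
{
    return acquired_by != 0 && acquired_by != my_pid;
}
```

The only case where the two tests disagree is acquired_by == my_pid,
which is exactly the apply worker advancing its own origin.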

clear_subscription_skip_xid()
{
..
+    else if (!XLogRecPtrIsInvalid(origin_lsn))
+    {
+        /*
+         * User has already changed subskipxid before clearing the subskipxid, so
+         * don't change the catalog but just advance the replication origin.
+         */
+        replorigin_advance(replorigin_session_origin, origin_lsn,
+                           GetXLogInsertRecPtr(),
+                           false, /* go_backward */
+                           true /* wal_log */);
+    }
..
}

I was wondering: what if we don't advance the origin explicitly in this
case? That would be no different from transactions where the apply
worker doesn't apply any change, either because the initial sync is in
progress (see should_apply_changes_for_rel()) or because we have
received an empty transaction. In those cases too, the origin lsn isn't
advanced even though we acknowledge the advanced last_received
location because of keep_alive messages. After a restart we may send
the old start_lsn location, because the replication origin was not
updated before the restart, but the server handles that case by
starting from the last confirmed location. See the code below:

CreateDecodingContext()
{
..
else if (start_lsn < slot->data.confirmed_flush)
..
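
The rule that fragment refers to can be modelled as follows. This is a
minimal sketch under stated assumptions: effective_start_lsn() and
INVALID_LSN are illustrative names, and a bare uint64_t stands in for
XLogRecPtr; the real CreateDecodingContext() does considerably more.

```c
#include <stdint.h>

/* Stand-in for InvalidXLogRecPtr, which is 0. */
#define INVALID_LSN ((uint64_t) 0)

/*
 * Model of how the publisher picks the position to start decoding from.
 * A missing start position, or one that lies before what the slot has
 * already confirmed as flushed, is forwarded to confirmed_flush.
 */
uint64_t
effective_start_lsn(uint64_t requested, uint64_t confirmed_flush)
{
    if (requested == INVALID_LSN || requested < confirmed_flush)
        return confirmed_flush;

    return requested;
}
```

So a subscriber that restarts with a stale origin and asks for an old
start_lsn does not receive already-confirmed transactions again, as long
as the flush position was acknowledged before the restart.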

A few other comments on the latest patch:
=================================
1.
A conflict will produce an error and will stop the replication; it must be
    resolved manually by the user.  Details about the conflict can be found in
-   the subscriber's server log.
+   <xref linkend="monitoring-pg-stat-subscription-workers"/> as well as the
+   subscriber's server log.

Can we slightly change the modified line to: "Details about the
conflict can be found in <xref
linkend="monitoring-pg-stat-subscription-workers"/> and the
subscriber's server log."? I think we can commit this change
separately as this is true even without this patch.

2.
    The resolution can be done either by changing data on the subscriber so
-   that it does not conflict with the incoming change or by skipping the
-   transaction that conflicts with the existing data.  The transaction can be
-   skipped by calling the <link linkend="pg-replication-origin-advance">
+   that it does not conflict with the incoming changes or by skipping the whole
+   transaction.  This option specifies the ID of the transaction whose
+   application is to be skipped by the logical replication worker.  The logical
+   replication worker skips all data modification transaction conflicts with
+   the existing data. When a conflict produce an error, it is shown in
+   <structname>pg_stat_subscription_workers</structname> view as follows:

I don't think most of the additional text added in the above paragraph
is required. We can rephrase it as: "The resolution can be done either
by changing data on the subscriber so that it does not conflict with
the incoming change or by skipping the transaction that conflicts with
the existing data. When a conflict produces an error, it is shown in
<structname>pg_stat_subscription_workers</structname> view as
follows:". After that, keep the text you have.

3.
They skip the whole transaction, including changes that may not violate any
+   constraint.  They may easily make the subscriber inconsistent, especially if
+   a user specifies the wrong transaction ID or the position of origin.

Can we slightly reword the above text as: "Skipping the whole
transaction includes skipping the changes that may not violate any
constraint.  This can easily make the subscriber inconsistent,
especially if a user specifies the wrong transaction ID or the
position of origin."?

4.
The logical replication worker skips all data
+      modification changes within the specified transaction.  Therefore, since
+      it skips the whole transaction including the changes that may not violate
+      the constraint, it should only be used as a last resort. This option has
+      no effect for the transaction that is already prepared with enabling
+      <literal>two_phase</literal> on susbscriber.

Let's slightly reword the above text as: "The logical replication
worker skips all data modification changes within the specified
transaction including the changes that may not violate the constraint,
so, it should only be used as a last resort. This option has no effect
on the transaction that is already prepared by enabling
<literal>two_phase</literal> on the subscriber."

5.
+          by the logical replication worker. Setting
<literal>NONE</literal> means
+          to reset the transaction ID.

Let's slightly reword the second part of the sentence as: "Setting
<literal>NONE</literal> resets the transaction ID."

6.
Once we start skipping
+ * changes, we don't stop it until the we skip all changes of the
transaction even
+ * if the subscription invalidated and MySubscription->skipxid gets
changed or reset.

/subscription invalidated/subscription is invalidated

What do you mean by "subscription invalidated", and how is it related
to this feature? I think we should mention something along these lines
in the docs as well.

7.
"Please refer to the comments in these functions for details.". We can
slightly modify this part of the comment as: "Please refer to the
comments in corresponding functions for details."

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 11, 2022, 05:21:57

On Mon, Jan 10, 2022 at 2:57 PM vignesh C <vignesh21@gmail.com> wrote:
>
> 2) Can we have an option to specify last_error_xid of
> pg_stat_subscription_workers. Something like:
> alter subscription sub1 skip ( XID = 'last_subscription_error');
>
> When the user specified last_subscription_error, it should pick
> last_error_xid from pg_stat_subscription_workers.
> As this operation is a critical operation, if there is an option which
> could automatically pick and set from pg_stat_subscription_workers, it
> would be useful.
>

I think having some automatic functionality around this would be good,
but I am not so sure about this idea, because it is possible that the
error has not yet reached the stats collector and the user is
referring to the server logs to set the skip xid. In such cases, even
though an error has occurred, we won't be able to set the required
xid. One can imagine that if we don't get the required value from
pg_stat_subscription_workers, we return an error telling the user to
cross-check the server logs and set the appropriate xid value, but
IMO that could be confusing. I feel that even if we want some
automatic functionality like the one you are proposing, or something
else, it could be done as a separate patch. Let's wait and see what
Sawada-San or others think about this.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

vignesh C

Date:

January 11, 2022, 05:27:34

On Tue, Jan 11, 2022 at 7:52 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Jan 10, 2022 at 2:57 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > 2) Can we have an option to specify last_error_xid of
> > pg_stat_subscription_workers. Something like:
> > alter subscription sub1 skip ( XID = 'last_subscription_error');
> >
> > When the user specified last_subscription_error, it should pick
> > last_error_xid from pg_stat_subscription_workers.
> > As this operation is a critical operation, if there is an option which
> > could automatically pick and set from pg_stat_subscription_workers, it
> > would be useful.
> >
>
> I think having some automatic functionality around this would be good
> but I am not so sure about this idea because it is possible that the
> error has not reached the stats collector and the user might be
> referring to server logs to set the skip xid. In such cases, even
> though an error would have occurred but we won't be able to set the
> required xid. Now, one can imagine that if we don't get the required
> value from pg_stat_subscription_workers then we can return an error to
> the user indicating that she can cross-verify the server logs and set
> the appropriate xid value but IMO it could be confusing. I feel even
> if we want some automatic functionality like you are proposing or
> something else, it could be done as a separate patch but let's wait
> and see what Sawada-San or others think about this?

If we are OK with the suggested idea, it can be done as a separate
patch; I agree that it need not be part of the existing patch.

Regards,
Vignesh

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 11, 2022, 06:22:08

On Mon, Jan 10, 2022 at 8:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Dec 16, 2021 at 11:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > >
> > > > So if skip_xid is already changed, the apply worker would do
> > > > replorigin_advance() with WAL logging, instead of committing the
> > > > catalog change?
> > > >
> > >
> > > Right. BTW, how are you planning to advance the origin? Normally, a
> > > commit transaction would do it but when we are skipping all changes,
> > > the commit might not do it as there won't be any transaction id
> > > assigned.
> >
> > I've not tested it yet but replorigin_advance() with wal_log = true
> > seems to work for this case.
> >
>
> IIUC, the changes corresponding to above in the latest patch are as follows:
>
> --- a/src/backend/replication/logical/origin.c
> +++ b/src/backend/replication/logical/origin.c
> @@ -921,7 +921,8 @@ replorigin_advance(RepOriginId node,
>   LWLockAcquire(&replication_state->lock, LW_EXCLUSIVE);
>
>   /* Make sure it's not used by somebody else */
> - if (replication_state->acquired_by != 0)
> + if (replication_state->acquired_by != 0 &&
> + replication_state->acquired_by != MyProcPid)
>   {
> ...
>
> clear_subscription_skip_xid()
> {
> ..
> + else if (!XLogRecPtrIsInvalid(origin_lsn))
> + {
> + /*
> + * User has already changed subskipxid before clearing the subskipxid, so
> + * don't change the catalog but just advance the replication origin.
> + */
> + replorigin_advance(replorigin_session_origin, origin_lsn,
> +    GetXLogInsertRecPtr(),
> +    false, /* go_backward */
> +    true /* wal_log */);
> + }
> ..
> }
>
> I was thinking what if we don't advance origin explicitly in this
> case? Actually, that will be no different than the transactions where
> the apply worker doesn't apply any change because the initial sync is
> in progress (see should_apply_changes_for_rel()) or we have received
> an empty transaction. In those cases also, the origin lsn won't be
> advanced even though we acknowledge the advanced last_received
> location because of keep_alive messages. Now, it is possible after the
> restart we send the old start_lsn location because the replication
> origin was not updated before restart but we handle that case in the
> server by starting from the last confirmed location. See below code:
>
> CreateDecodingContext()
> {
> ..
> else if (start_lsn < slot->data.confirmed_flush)
> ..

Good point. One minor difference from the empty-transaction case
arises when the server restarts or crashes before sending an
acknowledgment of the flush location. With an empty transaction, the
publisher simply sends the empty transaction again. With a skipped
transaction, on the other hand, the non-empty transaction is sent
again but skip_xid has already been changed or cleared, so the user
has to specify skip_xid again. Writing a replication origin WAL
record to advance the origin lsn would reduce the likelihood of that.
But I think it’s a very minor case, so we don’t need to deal with it.

Anyway, according to your analysis, I think we don't necessarily need
to do replorigin_advance() in this case.

>
> Few other comments on the latest patch:
> =================================
> 1.
> A conflict will produce an error and will stop the replication; it must be
>     resolved manually by the user.  Details about the conflict can be found in
> -   the subscriber's server log.
> +   <xref linkend="monitoring-pg-stat-subscription-workers"/> as well as the
> +   subscriber's server log.
>
> Can we slightly change the modified line to: "Details about the
> conflict can be found in <xref
> linkend="monitoring-pg-stat-subscription-workers"/> and the
> subscriber's server log."?

Will fix it.

>  I think we can commit this change
> separately as this is true even without this patch.

Right. It seems to be an oversight in 8d74fc96db. I've attached the patch.

>
> 2.
>     The resolution can be done either by changing data on the subscriber so
> -   that it does not conflict with the incoming change or by skipping the
> -   transaction that conflicts with the existing data.  The transaction can be
> -   skipped by calling the <link linkend="pg-replication-origin-advance">
> +   that it does not conflict with the incoming changes or by skipping the whole
> +   transaction.  This option specifies the ID of the transaction whose
> +   application is to be skipped by the logical replication worker.  The logical
> +   replication worker skips all data modification transaction conflicts with
> +   the existing data. When a conflict produce an error, it is shown in
> +   <structname>pg_stat_subscription_workers</structname> view as follows:
>
> I don't think most of the additional text added in the above paragraph
> is required. We can rephrase it as: "The resolution can be done either
> by changing data on the subscriber so that it does not conflict with
> the incoming change or by skipping the transaction that conflicts with
> the existing data. When a conflict produces an error, it is shown in
> <structname>pg_stat_subscription_workers</structname> view as
> follows:". After that keep the text, you have.

Agreed, will fix.

>
> 3.
> They skip the whole transaction, including changes that may not violate any
> +   constraint.  They may easily make the subscriber inconsistent, especially if
> +   a user specifies the wrong transaction ID or the position of origin.
>
> Can we slightly reword the above text as: "Skipping the whole
> transaction includes skipping the changes that may not violate any
> constraint.  This can easily make the subscriber inconsistent,
> especially if a user specifies the wrong transaction ID or the
> position of origin."?

Will fix.

>
> 4.
> The logical replication worker skips all data
> +      modification changes within the specified transaction.  Therefore, since
> +      it skips the whole transaction including the changes that may not violate
> +      the constraint, it should only be used as a last resort. This option has
> +      no effect for the transaction that is already prepared with enabling
> +      <literal>two_phase</literal> on susbscriber.
>
> Let's slightly reword the above text as: "The logical replication
> worker skips all data modification changes within the specified
> transaction including the changes that may not violate the constraint,
> so, it should only be used as a last resort. This option has no effect
> on the transaction that is already prepared by enabling
> <literal>two_phase</literal> on the subscriber."

Will fix.

>
> 5.
> +          by the logical replication worker. Setting
> <literal>NONE</literal> means
> +          to reset the transaction ID.
>
> Let's slightly reword the second part of the sentence as: "Setting
> <literal>NONE</literal> resets the transaction ID."

Will fix.

>
> 6.
> Once we start skipping
> + * changes, we don't stop it until the we skip all changes of the
> transaction even
> + * if the subscription invalidated and MySubscription->skipxid gets
> changed or reset.
>
> /subscription invalidated/subscription is invalidated

Will fix.

>
> What do you mean by subscription invalidated and how is it related to
> this feature? I think we should mention something on these lines in
> the docs as well.

I meant that MySubscription, a cache of the pg_subscription entry, is
invalidated by the catalog change. IIUC, while applying changes we
don't re-read pg_subscription (i.e., we don't call
maybe_reread_subscription()). Similarly, we don't do that while
skipping changes either. Therefore, even if skip_xid is changed while
we are skipping changes, we don't stop skipping them.
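
The behavior described above can be modelled as a small latch. This is
a sketch, not the patch: ApplySketch and the sketch_* functions are
hypothetical names, cached_skip_xid plays the role of
MySubscription->skipxid, and clearing the catalog entry at commit is
left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for InvalidTransactionId, which is 0. */
#define INVALID_XID ((uint32_t) 0)

typedef struct ApplySketch
{
    uint32_t    cached_skip_xid;    /* models MySubscription->skipxid */
    uint32_t    skipping_xid;       /* xid being skipped, or INVALID_XID */
} ApplySketch;

/* At BEGIN: decide once whether this remote transaction is skipped. */
void
sketch_begin(ApplySketch *st, uint32_t remote_xid)
{
    if (st->cached_skip_xid != INVALID_XID &&
        st->cached_skip_xid == remote_xid)
        st->skipping_xid = remote_xid;
}

/* For each change: only the latched value is consulted, not the cache. */
bool
sketch_should_skip(const ApplySketch *st)
{
    return st->skipping_xid != INVALID_XID;
}

/* At COMMIT: stop skipping. */
void
sketch_commit(ApplySketch *st)
{
    st->skipping_xid = INVALID_XID;
}
```

Because sketch_should_skip() never looks at cached_skip_xid, changing or
resetting that value in the middle of the transaction has no effect until
the next BEGIN.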

>
> 7.
> "Please refer to the comments in these functions for details.". We can
> slightly modify this part of the comment as: "Please refer to the
> comments in corresponding functions for details."

Will fix.

I'll submit an updated patch soon.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

doc_update.patch

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 11, 2022, 09:01:23

On Tue, Jan 11, 2022 at 11:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Jan 10, 2022 at 2:57 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > 2) Can we have an option to specify last_error_xid of
> > pg_stat_subscription_workers. Something like:
> > alter subscription sub1 skip ( XID = 'last_subscription_error');
> >
> > When the user specified last_subscription_error, it should pick
> > last_error_xid from pg_stat_subscription_workers.
> > As this operation is a critical operation, if there is an option which
> > could automatically pick and set from pg_stat_subscription_workers, it
> > would be useful.
> >
>
> I think having some automatic functionality around this would be good
> but I am not so sure about this idea because it is possible that the
> error has not reached the stats collector and the user might be
> referring to server logs to set the skip xid. In such cases, even
> though an error would have occurred but we won't be able to set the
> required xid. Now, one can imagine that if we don't get the required
> value from pg_stat_subscription_workers then we can return an error to
> the user indicating that she can cross-verify the server logs and set
> the appropriate xid value but IMO it could be confusing. I feel even
> if we want some automatic functionality like you are proposing or
> something else, it could be done as a separate patch but let's wait
> and see what Sawada-San or others think about this?

Agreed. Setting the XID automatically would be a good idea, but we can
do that in a separate patch so that the first patch stays simple.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 11, 2022, 09:12:01

On Tue, Jan 11, 2022 at 8:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Jan 10, 2022 at 8:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > I was thinking what if we don't advance origin explicitly in this
> > case? Actually, that will be no different than the transactions where
> > the apply worker doesn't apply any change because the initial sync is
> > in progress (see should_apply_changes_for_rel()) or we have received
> > an empty transaction. In those cases also, the origin lsn won't be
> > advanced even though we acknowledge the advanced last_received
> > location because of keep_alive messages. Now, it is possible after the
> > restart we send the old start_lsn location because the replication
> > origin was not updated before restart but we handle that case in the
> > server by starting from the last confirmed location. See below code:
> >
> > CreateDecodingContext()
> > {
> > ..
> > else if (start_lsn < slot->data.confirmed_flush)
> > ..
>
> Good point. Probably one minor thing that is different from the
> transaction where the apply worker applied an empty transaction is a
> case where the server restarts/crashes before sending an
> acknowledgment of the flush location. That is, in the case of the
> empty transaction, the publisher sends an empty transaction again. On
> the other hand in the case of skipping the transaction, a non-empty
> transaction will be sent again but skip_xid is already changed or
> cleared, therefore the user will have to specify skip_xid again. If we
> write replication origin WAL record to advance the origin lsn, it
> reduces the possibility of that. But I think it’s a very minor case so
> we won’t need to deal with that.
>

Yeah, in the worst case, it will lead to the conflict again and the
user will need to set the xid again.

> Anyway, according to your analysis, I think we don't necessarily need
> to do replorigin_advance() in this case.
>

Right.

> >
> > 5.
> > +          by the logical replication worker. Setting
> > <literal>NONE</literal> means
> > +          to reset the transaction ID.
> >
> > Let's slightly reword the second part of the sentence as: "Setting
> > <literal>NONE</literal> resets the transaction ID."
>
> Will fix.
>
> >
> > 6.
> > Once we start skipping
> > + * changes, we don't stop it until the we skip all changes of the
> > transaction even
> > + * if the subscription invalidated and MySubscription->skipxid gets
> > changed or reset.
> >
> > /subscription invalidated/subscription is invalidated
>
> Will fix.
>
> >
> > What do you mean by subscription invalidated and how is it related to
> > this feature? I think we should mention something on these lines in
> > the docs as well.
>
> I meant that MySubscription, a cache of pg_subscription entry, is
> invalidated by the catalog change. IIUC while applying changes we
> don't re-read pg_subscription (i.e., not calling
> maybe_reread_subscription()). Similarly, while skipping changes, we
> also don't do that. Therefore, even if skip_xid has been changed while
> skipping changes, we don't stop skipping changes.
>

Okay, but I don't think we need to mention that the subscription is
invalidated, as that could be confusing; the other part of the comment
is quite clear.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 11, 2022, 11:20:39

On Tue, Jan 11, 2022 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jan 11, 2022 at 8:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Jan 10, 2022 at 8:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > I was thinking what if we don't advance origin explicitly in this
> > > case? Actually, that will be no different than the transactions where
> > > the apply worker doesn't apply any change because the initial sync is
> > > in progress (see should_apply_changes_for_rel()) or we have received
> > > an empty transaction. In those cases also, the origin lsn won't be
> > > advanced even though we acknowledge the advanced last_received
> > > location because of keep_alive messages. Now, it is possible after the
> > > restart we send the old start_lsn location because the replication
> > > origin was not updated before restart but we handle that case in the
> > > server by starting from the last confirmed location. See below code:
> > >
> > > CreateDecodingContext()
> > > {
> > > ..
> > > else if (start_lsn < slot->data.confirmed_flush)
> > > ..
> >
> > Good point. Probably one minor thing that is different from the
> > transaction where the apply worker applied an empty transaction is a
> > case where the server restarts/crashes before sending an
> > acknowledgment of the flush location. That is, in the case of the
> > empty transaction, the publisher sends an empty transaction again. On
> > the other hand in the case of skipping the transaction, a non-empty
> > transaction will be sent again but skip_xid is already changed or
> > cleared, therefore the user will have to specify skip_xid again. If we
> > write replication origin WAL record to advance the origin lsn, it
> > reduces the possibility of that. But I think it’s a very minor case so
> > we won’t need to deal with that.
> >
>
> Yeah, in the worst case, it will lead to conflict again and the user
> needs to set the xid again.

On second thought, the same is true for other cases, for example,
preparing the transaction and clearing skip_xid while handling a
prepare message. That is, currently we don't clear skip_xid while
handling a prepare message but do that while handling commit/rollback
prepared message, in order to avoid the worst case. If we do both
while handling a prepare message and the server crashes between them,
it ends up that skip_xid is cleared and the transaction will be
resent, which is identical to the worst case above. Therefore, if we
accept this situation because of its low probability, we can probably
do the same for other cases too, which makes the patch simpler,
especially for the prepare and commit/rollback-prepared cases. What do you
think?


Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 11, 2022, 13:08:04

On Tue, Jan 11, 2022 at 1:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Jan 11, 2022 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Jan 11, 2022 at 8:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Mon, Jan 10, 2022 at 8:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > I was thinking what if we don't advance origin explicitly in this
> > > > case? Actually, that will be no different than the transactions where
> > > > the apply worker doesn't apply any change because the initial sync is
> > > > in progress (see should_apply_changes_for_rel()) or we have received
> > > > an empty transaction. In those cases also, the origin lsn won't be
> > > > advanced even though we acknowledge the advanced last_received
> > > > location because of keep_alive messages. Now, it is possible after the
> > > > restart we send the old start_lsn location because the replication
> > > > origin was not updated before restart but we handle that case in the
> > > > server by starting from the last confirmed location. See below code:
> > > >
> > > > CreateDecodingContext()
> > > > {
> > > > ..
> > > > else if (start_lsn < slot->data.confirmed_flush)
> > > > ..
> > >
> > > Good point. Probably one minor thing that is different from the
> > > transaction where the apply worker applied an empty transaction is a
> > > case where the server restarts/crashes before sending an
> > > acknowledgment of the flush location. That is, in the case of the
> > > empty transaction, the publisher sends an empty transaction again. On
> > > the other hand in the case of skipping the transaction, a non-empty
> > > transaction will be sent again but skip_xid is already changed or
> > > cleared, therefore the user will have to specify skip_xid again. If we
> > > write replication origin WAL record to advance the origin lsn, it
> > > reduces the possibility of that. But I think it’s a very minor case so
> > > we won’t need to deal with that.
> > >
> >
> > Yeah, in the worst case, it will lead to conflict again and the user
> > needs to set the xid again.
>
> On second thought, the same is true for other cases, for example,
> preparing the transaction and clearing skip_xid while handling a
> prepare message. That is, currently we don't clear skip_xid while
> handling a prepare message but do that while handling commit/rollback
> prepared message, in order to avoid the worst case. If we do both
> while handling a prepare message and the server crashes between them,
> it ends up that skip_xid is cleared and the transaction will be
> resent, which is identical to the worst-case above.
>

How are you thinking to update the skip xid before prepare? If we do
it in the same transaction then the changes in the catalog will be
part of the prepared xact but won't be committed. Now, say if we do it
after prepare, then the situation won't be the same because after
restart the same xact won't appear again.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 11, 2022, 13:10:59

On Tue, Jan 11, 2022 at 8:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Jan 10, 2022 at 8:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > Few other comments on the latest patch:
> > =================================
> > 1.
> > A conflict will produce an error and will stop the replication; it must be
> >     resolved manually by the user.  Details about the conflict can be found in
> > -   the subscriber's server log.
> > +   <xref linkend="monitoring-pg-stat-subscription-workers"/> as well as the
> > +   subscriber's server log.
> >
> > Can we slightly change the modified line to: "Details about the
> > conflict can be found in <xref
> > linkend="monitoring-pg-stat-subscription-workers"/> and the
> > subscriber's server log."?
>
> Will fix it.
>
> >  I think we can commit this change
> > separately as this is true even without this patch.
>
> Right. It seems an oversight of 8d74fc96db. I've attached the patch.
>

Pushed.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 12, 2022, 03:19:07

On Tue, Jan 11, 2022 at 7:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jan 11, 2022 at 1:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Jan 11, 2022 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Jan 11, 2022 at 8:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Mon, Jan 10, 2022 at 8:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > I was thinking what if we don't advance origin explicitly in this
> > > > > case? Actually, that will be no different than the transactions where
> > > > > the apply worker doesn't apply any change because the initial sync is
> > > > > in progress (see should_apply_changes_for_rel()) or we have received
> > > > > an empty transaction. In those cases also, the origin lsn won't be
> > > > > advanced even though we acknowledge the advanced last_received
> > > > > location because of keep_alive messages. Now, it is possible after the
> > > > > restart we send the old start_lsn location because the replication
> > > > > origin was not updated before restart but we handle that case in the
> > > > > server by starting from the last confirmed location. See below code:
> > > > >
> > > > > CreateDecodingContext()
> > > > > {
> > > > > ..
> > > > > else if (start_lsn < slot->data.confirmed_flush)
> > > > > ..
> > > >
> > > > Good point. Probably one minor thing that is different from the
> > > > transaction where the apply worker applied an empty transaction is a
> > > > case where the server restarts/crashes before sending an
> > > > acknowledgment of the flush location. That is, in the case of the
> > > > empty transaction, the publisher sends an empty transaction again. On
> > > > the other hand in the case of skipping the transaction, a non-empty
> > > > transaction will be sent again but skip_xid is already changed or
> > > > cleared, therefore the user will have to specify skip_xid again. If we
> > > > write replication origin WAL record to advance the origin lsn, it
> > > > reduces the possibility of that. But I think it’s a very minor case so
> > > > we won’t need to deal with that.
> > > >
> > >
> > > Yeah, in the worst case, it will lead to conflict again and the user
> > > needs to set the xid again.
> >
> > On second thought, the same is true for other cases, for example,
> > preparing the transaction and clearing skip_xid while handling a
> > prepare message. That is, currently we don't clear skip_xid while
> > handling a prepare message but do that while handling commit/rollback
> > prepared message, in order to avoid the worst case. If we do both
> > while handling a prepare message and the server crashes between them,
> > it ends up that skip_xid is cleared and the transaction will be
> > resent, which is identical to the worst-case above.
> >
>
> How are you thinking to update the skip xid before prepare? If we do
> it in the same transaction then the changes in the catalog will be
> part of the prepared xact but won't be committed. Now, say if we do it
> after prepare, then the situation won't be the same because after
> restart the same xact won't appear again.

I was thinking of committing the catalog change first in a separate
transaction without updating the origin LSN, and then preparing an empty
transaction while updating the origin LSN. If the server crashes between
them, skip_xid is cleared but the transaction will be resent.
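
The crash window in this two-step approach can be modelled with a toy state machine. The sketch below is not PostgreSQL code: SubscriberState, skip_at_prepare() and the two predicates are hypothetical names introduced only for illustration, and durability is reduced to plain assignments. It assumes the publisher resends any transaction whose end LSN lies beyond the subscriber's origin LSN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t xid_t;   /* stand-in for TransactionId */
typedef uint64_t lsn_t;   /* stand-in for XLogRecPtr */

/* Hypothetical durable subscriber state; not a PostgreSQL structure. */
typedef struct SubscriberState
{
    xid_t skip_xid;     /* 0 means "nothing to skip" */
    lsn_t origin_lsn;   /* last position known to be applied or skipped */
} SubscriberState;

/*
 * Step 1 commits the catalog change that clears skip_xid.
 * Step 2 prepares an empty transaction that advances the origin LSN.
 * With crash_between set, only step 1 becomes durable.
 */
static void
skip_at_prepare(SubscriberState *st, lsn_t prepare_end_lsn, bool crash_between)
{
    st->skip_xid = 0;                   /* step 1 */
    if (crash_between)
        return;                         /* crash before step 2 */
    assert(prepare_end_lsn >= st->origin_lsn);
    st->origin_lsn = prepare_end_lsn;   /* step 2 */
}

/* Assumed rule: a transaction is resent if the origin has not passed it. */
static bool
will_be_resent(const SubscriberState *st, lsn_t xact_end_lsn)
{
    return st->origin_lsn < xact_end_lsn;
}

/* A resent transaction is applied again unless skip_xid still matches it. */
static bool
will_conflict_again(const SubscriberState *st, xid_t xid, lsn_t xact_end_lsn)
{
    return will_be_resent(st, xact_end_lsn) && st->skip_xid != xid;
}
```

Without a crash, the origin moves past the transaction and nothing is resent. With a crash between the two steps, skip_xid is already cleared while the origin is unchanged, so the resent transaction conflicts again and the user has to set skip_xid once more.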

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 12, 2022, 03:19:27

On Tue, Jan 11, 2022 at 7:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jan 11, 2022 at 8:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Jan 10, 2022 at 8:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > Few other comments on the latest patch:
> > > =================================
> > > 1.
> > > A conflict will produce an error and will stop the replication; it must be
> > >     resolved manually by the user.  Details about the conflict can be found in
> > > -   the subscriber's server log.
> > > +   <xref linkend="monitoring-pg-stat-subscription-workers"/> as well as the
> > > +   subscriber's server log.
> > >
> > > Can we slightly change the modified line to: "Details about the
> > > conflict can be found in <xref
> > > linkend="monitoring-pg-stat-subscription-workers"/> and the
> > > subscriber's server log."?
> >
> > Will fix it.
> >
> > >  I think we can commit this change
> > > separately as this is true even without this patch.
> >
> > Right. It seems an oversight of 8d74fc96db. I've attached the patch.
> >
>
> Pushed.

Thanks!

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 12, 2022, 06:21:24

On Wed, Jan 12, 2022 at 5:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Jan 11, 2022 at 7:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Jan 11, 2022 at 1:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On second thought, the same is true for other cases, for example,
> > > preparing the transaction and clearing skip_xid while handling a
> > > prepare message. That is, currently we don't clear skip_xid while
> > > handling a prepare message but do that while handling commit/rollback
> > > prepared message, in order to avoid the worst case. If we do both
> > > while handling a prepare message and the server crashes between them,
> > > it ends up that skip_xid is cleared and the transaction will be
> > > resent, which is identical to the worst-case above.
> > >
> >
> > How are you thinking to update the skip xid before prepare? If we do
> > it in the same transaction then the changes in the catalog will be
> > part of the prepared xact but won't be committed. Now, say if we do it
> > after prepare, then the situation won't be the same because after
> > restart the same xact won't appear again.
>
> I was thinking to commit the catalog change first in a separate
> transaction while not updating origin LSN and then prepare an empty
> transaction while updating origin LSN.
>

But, won't it complicate the handling if in the future we try to
enhance this API such that it skips partial changes like skipping only
for particular relation(s) or particular operations as discussed
previously in this thread?

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 12, 2022, 09:02:19

On Wed, Jan 12, 2022 at 12:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jan 12, 2022 at 5:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Jan 11, 2022 at 7:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Jan 11, 2022 at 1:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On second thought, the same is true for other cases, for example,
> > > > preparing the transaction and clearing skip_xid while handling a
> > > > prepare message. That is, currently we don't clear skip_xid while
> > > > handling a prepare message but do that while handling commit/rollback
> > > > prepared message, in order to avoid the worst case. If we do both
> > > > while handling a prepare message and the server crashes between them,
> > > > it ends up that skip_xid is cleared and the transaction will be
> > > > resent, which is identical to the worst-case above.
> > > >
> > >
> > > How are you thinking to update the skip xid before prepare? If we do
> > > it in the same transaction then the changes in the catalog will be
> > > part of the prepared xact but won't be committed. Now, say if we do it
> > > after prepare, then the situation won't be the same because after
> > > restart the same xact won't appear again.
> >
> > I was thinking to commit the catalog change first in a separate
> > transaction while not updating origin LSN and then prepare an empty
> > transaction while updating origin LSN.
> >
>
> But, won't it complicate the handling if in the future we try to
> enhance this API such that it skips partial changes like skipping only
> for particular relation(s) or particular operations as discussed
> previously in this thread?

Right. I was thinking that if we accept the situation where the user
has to set skip_xid again if the server crashes, we might also be
able to accept the situation where the user has to clear skip_xid
if the server crashes. But it seems the former is less
problematic.

I've attached an updated patch that incorporated all comments I got so far.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

v3-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patch

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 12, 2022, 09:03:43

On Mon, Jan 10, 2022 at 6:27 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, Jan 7, 2022 at 11:23 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Jan 7, 2022 at 10:04 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Wed, Jan 5, 2022 at 12:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Mon, Dec 27, 2021 at 9:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > On Thu, Dec 16, 2021 at 2:42 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > > On Thu, Dec 16, 2021 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > >
> > > > > > > On Thu, Dec 16, 2021 at 10:37 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Thu, Dec 16, 2021 at 11:43 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > I thought we just want to lock before clearing the skip_xid something
> > > > > > > > > like take the lock, check if the skip_xid in the catalog is the same
> > > > > > > > > as we have skipped, if it is the same then clear it, otherwise, leave
> > > > > > > > > it as it is. How will that disallow users to change skip_xid when we
> > > > > > > > > are skipping changes?
> > > > > > > >
> > > > > > > > Oh I thought we wanted to keep holding the lock while skipping changes
> > > > > > > > (changing skip_xid requires acquiring the lock).
> > > > > > > >
> > > > > > > > So if skip_xid is already changed, the apply worker would do
> > > > > > > > replorigin_advance() with WAL logging, instead of committing the
> > > > > > > > catalog change?
> > > > > > > >
> > > > > > >
> > > > > > > Right. BTW, how are you planning to advance the origin? Normally, a
> > > > > > > commit transaction would do it but when we are skipping all changes,
> > > > > > > the commit might not do it as there won't be any transaction id
> > > > > > > assigned.
> > > > > >
> > > > > > I've not tested it yet but replorigin_advance() with wal_log = true
> > > > > > seems to work for this case.
> > > > >
> > > > > I've tested it and realized that we cannot use replorigin_advance()
> > > > > for this purpose without changes. That is, the current
> > > > > replorigin_advance() doesn't allow to advance the origin by the owner:
> > > > >
> > > > >         /* Make sure it's not used by somebody else */
> > > > >         if (replication_state->acquired_by != 0)
> > > > >         {
> > > > >             ereport(ERROR,
> > > > >                     (errcode(ERRCODE_OBJECT_IN_USE),
> > > > >                      errmsg("replication origin with OID %d is already
> > > > > active for PID %d",
> > > > >                             replication_state->roident,
> > > > >                             replication_state->acquired_by)));
> > > > >         }
> > > > >
> > > > > So we need to change it so that the origin owner can advance its
> > > > > origin, which makes sense to me.
> > > > >
> > > > > Also, when we have to update the origin instead of committing the
> > > > > catalog change while updating the origin, we cannot record the origin
> > > > > timestamp.
> > > > >
> > > >
> > > > Is it because we currently update the origin timestamp with commit record?
> > >
> > > Yes.
> > >
> > > >
> > > > > This behavior makes sense to me because we skipped the
> > > > > transaction. But ISTM it’s not good if we emit the origin timestamp
> > > > > only when directly updating the origin. So probably we need to always
> > > > > omit origin timestamp.
> > > > >
> > > >
> > > > Do you mean to say that you want to omit it even when we are
> > > > committing the changes?
> > >
> > > Yes, it would be better to record only origin lsn in terms of consistency.
> > >
> > > >
> > > > > Apart from that, I'm vaguely concerned that the logic seems to be
> > > > > getting complex. Probably it comes from the fact that we store
> > > > > skip_xid in the catalog and update the catalog to clear/set the
> > > > > skip_xid. It might be worth revisiting the idea of storing skip_xid on
> > > > > shmem (e.g., ReplicationState)?
> > > > >
> > > >
> > > > IIRC, the problem with that idea was that we won't remember skip_xid
> > > > information after server restart and the user won't even know that it
> > > > has to set it again.
> > >
> > > Right, I agree that it’s not convenient when the server restarts or
> > > crashes, but these problems could not be critical in the situation
> > > where users have to use this feature; the subscriber already entered
> > > an error loop so they can know xid again and it’s an uncommon case
> > > that they need to restart during skipping changes.
> > >
> > > Anyway, I'll submit an updated patch soon so we can discuss complexity
> > > vs. convenience.
> >
> > Attached an updated patch. Please review it.

Thank you for the comments!

>
> Thanks for the updated patch, few comments:
> 1) Should this be case insensitive to support NONE too:
> +                       /* Setting xid = NONE is treated as resetting xid */
> +                       if (strcmp(xid_str, "none") == 0)
> +                               xid = InvalidTransactionId;

I think the string value is always lowercase, so we don't need to use
strcasecmp here.
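
As a rough illustration of that point, here is a standalone sketch of the value handling. It assumes the option value reaches this code already in lowercase, as stated above; parse_skip_xid(), xid_t and INVALID_XID are hypothetical stand-ins and not the patch's code. Rejecting 0 is also an assumption of this sketch, since 0 would be indistinguishable from the reset value.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint32_t xid_t;             /* stand-in for TransactionId */
#define INVALID_XID ((xid_t) 0)     /* stand-in for InvalidTransactionId */

/*
 * Parse the value of a hypothetical "xid" option.
 * "none" resets the setting; anything else must be a decimal XID.
 * Returns false on malformed input and leaves *xid untouched.
 */
static bool
parse_skip_xid(const char *xid_str, xid_t *xid)
{
    char           *end;
    unsigned long   val;

    assert(xid_str != NULL && xid != NULL);

    /* Exact match is enough if the caller always passes lowercase. */
    if (strcmp(xid_str, "none") == 0)
    {
        *xid = INVALID_XID;
        return true;
    }

    /* strtoul() accepts signs and leading blanks, so require a digit first. */
    if (xid_str[0] < '0' || xid_str[0] > '9')
        return false;

    errno = 0;
    val = strtoul(xid_str, &end, 10);
    if (errno != 0 || *end != '\0' || val > UINT32_MAX)
        return false;

    /* Assumption of this sketch: 0 is reserved for "no XID". */
    if (val == 0)
        return false;

    *xid = (xid_t) val;
    return true;
}
```

With this shape, an uppercase "NONE" is rejected as malformed rather than treated as a reset, which is only acceptable if the caller really does normalize the case beforehand.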

>
> 2) Can we have an option to specify last_error_xid of
> pg_stat_subscription_workers. Something like:
> alter subscription sub1 skip ( XID = 'last_subscription_error');
>
> When the user specified last_subscription_error, it should pick
> last_error_xid from pg_stat_subscription_workers.
> As this operation is a critical operation, if there is an option which
> could automatically pick and set from pg_stat_subscription_workers, it
> would be useful.

As I mentioned before in another mail, I think we can do that in a
separate patch.

>
> 3) Currently the following syntax is being supported, I felt this
> should throw an error:
> postgres=# alter subscription sub1 set ( XID = 100);
> ALTER SUBSCRIPTION

Fixed.

>
> 4) You might need to rebase the patch:
> git am v2-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patch
> Applying: Add ALTER SUBSCRIPTION ... SKIP to skip the transaction on
> subscriber nodes
> error: patch failed: doc/src/sgml/logical-replication.sgml:333
> error: doc/src/sgml/logical-replication.sgml: patch does not apply
> Patch failed at 0001 Add ALTER SUBSCRIPTION ... SKIP to skip the
> transaction on subscriber nodes
> hint: Use 'git am --show-current-patch=diff' to see the failed patch
>
> 5) You might have to rename 027_skip_xact to 028_skip_xact as
> 027_nosuperuser.pl already exists
> diff --git a/src/test/subscription/t/027_skip_xact.pl
> b/src/test/subscription/t/027_skip_xact.pl
> new file mode 100644
> index 0000000000..a63c9c345e
> --- /dev/null
> +++ b/src/test/subscription/t/027_skip_xact.pl

I've resolved these conflicts.

These comments are incorporated into the latest v3 patch I just submitted[1].

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoD9JXah2V8uFURUpZbK_ewsut%2Bjb1ESm6YQkrhQm3nJRg%40mail.gmail.com

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

vignesh C

Date:

January 12, 2022, 17:10:42

On Wed, Jan 12, 2022 at 11:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jan 12, 2022 at 12:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Jan 12, 2022 at 5:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Tue, Jan 11, 2022 at 7:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Tue, Jan 11, 2022 at 1:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > On second thought, the same is true for other cases, for example,
> > > > > preparing the transaction and clearing skip_xid while handling a
> > > > > prepare message. That is, currently we don't clear skip_xid while
> > > > > handling a prepare message but do that while handling commit/rollback
> > > > > prepared message, in order to avoid the worst case. If we do both
> > > > > while handling a prepare message and the server crashes between them,
> > > > > it ends up that skip_xid is cleared and the transaction will be
> > > > > resent, which is identical to the worst-case above.
> > > > >
> > > >
> > > > How are you thinking to update the skip xid before prepare? If we do
> > > > it in the same transaction then the changes in the catalog will be
> > > > part of the prepared xact but won't be committed. Now, say if we do it
> > > > after prepare, then the situation won't be the same because after
> > > > restart the same xact won't appear again.
> > >
> > > I was thinking to commit the catalog change first in a separate
> > > transaction while not updating origin LSN and then prepare an empty
> > > transaction while updating origin LSN.
> > >
> >
> > But, won't it complicate the handling if in the future we try to
> > enhance this API such that it skips partial changes like skipping only
> > for particular relation(s) or particular operations as discussed
> > previously in this thread?
>
> Right. I was thinking that if we accept the situation that the user
> has to set skip_xid again in case of the server crashes, we might be
> able to accept also the situation that the user has to clear skip_xid
> in a case of the server crashes. But it seems the former is less
> problematic.
>
> I've attached an updated patch that incorporated all comments I got so far.

Thanks for the updated patch. A few comments:
1) Currently the skip xid is not displayed when describing subscriptions.
Can we include it too:
\dRs+  sub1
                                                        List of subscriptions
 Name |  Owner  | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit |            Conninfo
------+---------+---------+-------------+--------+-----------+------------------+--------------------+--------------------------------
 sub1 | vignesh | t       | {pub1}      | f      | f         | e                | off                | dbname=postgres host=localhost
(1 row)

2) This import "use PostgreSQL::Test::Utils;" is not required:
+# Tests for skipping logical replication transactions.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 6;

3) Some of the comments end with a punctuation mark and some do not.
Should we keep them consistent:
+    # Wait for worker error
+    $node_subscriber->poll_query_until(
+       'postgres',

+    # Set skip xid
+    $node_subscriber->safe_psql(
+       'postgres',

+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');


+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');

4) Should this be changed:
+ * True if we are skipping all data modification changes (INSERT, UPDATE, etc.) of
+ * the specified transaction at MySubscription->skipxid.  Once we start skipping
+ * changes, we don't stop it until the we skip all changes of the transaction even
+ * if pg_subscription is updated that and MySubscription->skipxid gets changed or
to:
+ * True if we are skipping all data modification changes (INSERT, UPDATE, etc.) of
+ * the specified transaction at MySubscription->skipxid.  Once we start skipping
+ * changes, we don't stop it until we skip all changes of the transaction even
+ * if pg_subscription is updated that and MySubscription->skipxid gets changed or

In "stop it until the we skip all changes", the word "the" is not required.

Regards,
Vignesh

RE: Skipping logical replication transactions on subscriber side

From

"tanghy.fnst@fujitsu.com"

Date:

January 13, 2022, 04:07:49

On Wed, Jan 12, 2022 2:02 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> 
> I've attached an updated patch that incorporated all comments I got so far.
> 

Thanks for updating the patch. Here are some comments:

1)
+      Skip applying changes of the particular transaction.  If incoming data

Should "Skip" be "Skips" ?

2)
+      prepared by enabling <literal>two_phase</literal> on susbscriber.  After h
+      the logical replication successfully skips the transaction, the transaction

The "h" after the word "After" seems redundant.

3)
+   Skipping the whole transaction includes skipping the cahnge that may not violate

"cahnge" should be "changes" I think.

4)
+/*
+ * True if we are skipping all data modification changes (INSERT, UPDATE, etc.) of
+ * the specified transaction at MySubscription->skipxid.  Once we start skipping
...
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skipping_xid))

Maybe we should modify this comment. Something like:
skipping_xid is valid if we are skipping all data modification changes ...

5)
+                    if (!superuser())
+                        ereport(ERROR,
+                                (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+                                 errmsg("must be superuser to set %s", "skip_xid")));

Should we change the message to "must be superuser to skip xid"?
Because the SQL stmt is "ALTER SUBSCRIPTION ... SKIP (xid = XXX)".
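
To illustrate comment 4, the following standalone sketch shows why "skipping_xid is valid if we are skipping ..." describes the variable more accurately than "True if ...": the variable holds an XID, and the macro derives the boolean from it. The typedef and the two TransactionId macros are simplified stand-ins so the snippet compiles outside the server; maybe_start_skipping() and stop_skipping() are hypothetical helpers, not functions from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins so this compiles without PostgreSQL headers. */
typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)
#define TransactionIdIsValid(xid) ((xid) != InvalidTransactionId)

/* Valid only while the changes of that transaction are being skipped. */
static TransactionId skipping_xid = InvalidTransactionId;
#define is_skipping_changes() (TransactionIdIsValid(skipping_xid))

/* Hypothetical helper: start skipping if the remote XID is the one to skip. */
static void
maybe_start_skipping(TransactionId remote_xid, TransactionId skip_xid)
{
    assert(!is_skipping_changes());
    if (TransactionIdIsValid(skip_xid) && remote_xid == skip_xid)
        skipping_xid = remote_xid;
}

/* Hypothetical helper: called once the skipped transaction has ended. */
static void
stop_skipping(void)
{
    skipping_xid = InvalidTransactionId;
}
```

Because the state is kept in its own variable, a later change to the configured skip XID does not affect a skip that is already in progress, which matches the behaviour the comment under review describes.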

Regards,
Tang

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 14, 2022, 05:19:10

On Wed, Jan 12, 2022 at 11:10 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, Jan 12, 2022 at 11:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Jan 12, 2022 at 12:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Jan 12, 2022 at 5:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Tue, Jan 11, 2022 at 7:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Tue, Jan 11, 2022 at 1:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > > On second thought, the same is true for other cases, for example,
> > > > > > preparing the transaction and clearing skip_xid while handling a
> > > > > > prepare message. That is, currently we don't clear skip_xid while
> > > > > > handling a prepare message but do that while handling commit/rollback
> > > > > > prepared message, in order to avoid the worst case. If we do both
> > > > > > while handling a prepare message and the server crashes between them,
> > > > > > it ends up that skip_xid is cleared and the transaction will be
> > > > > > resent, which is identical to the worst-case above.
> > > > > >
> > > > >
> > > > > How are you thinking to update the skip xid before prepare? If we do
> > > > > it in the same transaction then the changes in the catalog will be
> > > > > part of the prepared xact but won't be committed. Now, say if we do it
> > > > > after prepare, then the situation won't be the same because after
> > > > > restart the same xact won't appear again.
> > > >
> > > > I was thinking to commit the catalog change first in a separate
> > > > transaction while not updating origin LSN and then prepare an empty
> > > > transaction while updating origin LSN.
> > > >
> > >
> > > But, won't it complicate the handling if in the future we try to
> > > enhance this API such that it skips partial changes like skipping only
> > > for particular relation(s) or particular operations as discussed
> > > previously in this thread?
> >
> > Right. I was thinking that if we accept the situation that the user
> > has to set skip_xid again in case of the server crashes, we might be
> > able to accept also the situation that the user has to clear skip_xid
> > in a case of the server crashes. But it seems the former is less
> > problematic.
> >
> > I've attached an updated patch that incorporated all comments I got so far.
>
> Thanks for the updated patch, few comments:

Thank you for the comments!

> 1) Currently skip xid is not displayed in describe subscriptions, can
> we include it too:
> \dRs+  sub1
>                                                         List of subscriptions
>  Name |  Owner  | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit |            Conninfo
> ------+---------+---------+-------------+--------+-----------+------------------+--------------------+--------------------------------
>  sub1 | vignesh | t       | {pub1}      | f      | f         | e                | off                | dbname=postgres host=localhost
> (1 row)
>
> 2) This import "use PostgreSQL::Test::Utils;" is not required:
> +# Tests for skipping logical replication transactions.
> +use strict;
> +use warnings;
> +use PostgreSQL::Test::Cluster;
> +use PostgreSQL::Test::Utils;
> +use Test::More tests => 6;
>
> 3) Some of the comments uses a punctuation mark and some of them does
> not use, Should we keep it consistent:
> +    # Wait for worker error
> +    $node_subscriber->poll_query_until(
> +       'postgres',
>
> +    # Set skip xid
> +    $node_subscriber->safe_psql(
> +       'postgres',
>
> +# Create publisher node.
> +my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
> +$node_publisher->init(allows_streaming => 'logical');
>
>
> +# Create subscriber node.
> +my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
>
> 4) Should this be changed:
> + * True if we are skipping all data modification changes (INSERT, UPDATE, etc.) of
> + * the specified transaction at MySubscription->skipxid.  Once we start skipping
> + * changes, we don't stop it until the we skip all changes of the transaction even
> + * if pg_subscription is updated that and MySubscription->skipxid gets changed or
> to:
> + * True if we are skipping all data modification changes (INSERT, UPDATE, etc.) of
> + * the specified transaction at MySubscription->skipxid.  Once we start skipping
> + * changes, we don't stop it until we skip all changes of the transaction even
> + * if pg_subscription is updated that and MySubscription->skipxid gets changed or
>
> In "stop it until the we skip all changes", here the is not required.
>

I agree with all the comments above. I've attached an updated patch.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

v4-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patch

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

14 January 2022, 05:25:02

On Thu, Jan 13, 2022 at 10:07 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Wed, Jan 12, 2022 2:02 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached an updated patch that incorporated all comments I got so far.
> >
>
> Thanks for updating the patch. Here are some comments:

Thank you for the comments!

>
> 1)
> +      Skip applying changes of the particular transaction.  If incoming data
>
> Should "Skip" be "Skips" ?
>
> 2)
> +      prepared by enabling <literal>two_phase</literal> on susbscriber.  After h
> +      the logical replication successfully skips the transaction, the transaction
>
> The "h" after word "After" seems redundant.
>
> 3)
> +   Skipping the whole transaction includes skipping the cahnge that may not violate
>
> "cahnge" should be "changes" I think.
>
> 4)
> +/*
> + * True if we are skipping all data modification changes (INSERT, UPDATE, etc.) of
> + * the specified transaction at MySubscription->skipxid.  Once we start skipping
> ...
> + */
> +static TransactionId skipping_xid = InvalidTransactionId;
> +#define is_skipping_changes() (TransactionIdIsValid(skipping_xid))
>
> Maybe we should modify this comment. Something like:
> skipping_xid is valid if we are skipping all data modification changes ...
>
> 5)
> +                                       if (!superuser())
> +                                               ereport(ERROR,
> +                                                               (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> +                                                                errmsg("must be superuser to set %s", "skip_xid")));
>
> Should we change the message to "must be superuser to skip xid"?
> Because the SQL stmt is "ALTER SUBSCRIPTION ... SKIP (xid = XXX)".

I agree with all the comments above. These are incorporated into the
latest v4 patch I've just submitted[1].

Regards,

[1] postgresql.org/message-id/CAD21AoBZC87nY1pCaexk1uBA68JSBmy2-UqLGirT9g-RVMhjKw%40mail.gmail.com

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

vignesh C

Date:

14 January 2022, 15:05:25

On Fri, Jan 14, 2022 at 7:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jan 12, 2022 at 11:10 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Wed, Jan 12, 2022 at 11:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Wed, Jan 12, 2022 at 12:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Wed, Jan 12, 2022 at 5:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > On Tue, Jan 11, 2022 at 7:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > >
> > > > > > On Tue, Jan 11, 2022 at 1:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > > >
> > > > > > > On second thought, the same is true for other cases, for example,
> > > > > > > preparing the transaction and clearing skip_xid while handling a
> > > > > > > prepare message. That is, currently we don't clear skip_xid while
> > > > > > > handling a prepare message but do that while handling commit/rollback
> > > > > > > prepared message, in order to avoid the worst case. If we do both
> > > > > > > while handling a prepare message and the server crashes between them,
> > > > > > > it ends up that skip_xid is cleared and the transaction will be
> > > > > > > resent, which is identical to the worst-case above.
> > > > > > >
> > > > > >
> > > > > > How are you thinking to update the skip xid before prepare? If we do
> > > > > > it in the same transaction then the changes in the catalog will be
> > > > > > part of the prepared xact but won't be committed. Now, say if we do it
> > > > > > after prepare, then the situation won't be the same because after
> > > > > > restart the same xact won't appear again.
> > > > >
> > > > > I was thinking to commit the catalog change first in a separate
> > > > > transaction while not updating origin LSN and then prepare an empty
> > > > > transaction while updating origin LSN.
> > > > >
> > > >
> > > > But, won't it complicate the handling if in the future we try to
> > > > enhance this API such that it skips partial changes like skipping only
> > > > for particular relation(s) or particular operations as discussed
> > > > previously in this thread?
> > >
> > > Right. I was thinking that if we accept the situation that the user
> > > has to set skip_xid again in case of the server crashes, we might be
> > > able to accept also the situation that the user has to clear skip_xid
> > > in a case of the server crashes. But it seems the former is less
> > > problematic.
> > >
> > > I've attached an updated patch that incorporated all comments I got so far.
> >
> > Thanks for the updated patch, few comments:
>
> Thank you for the comments!
>
> > 1) Currently skip xid is not displayed in describe subscriptions, can
> > we include it too:
> > \dRs+  sub1
> >                                                         List of subscriptions
> >  Name |  Owner  | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit |            Conninfo
> > ------+---------+---------+-------------+--------+-----------+------------------+--------------------+--------------------------------
> >  sub1 | vignesh | t       | {pub1}      | f      | f         | e                | off                | dbname=postgres host=localhost
> > (1 row)
> >
> > 2) This import "use PostgreSQL::Test::Utils;" is not required:
> > +# Tests for skipping logical replication transactions.
> > +use strict;
> > +use warnings;
> > +use PostgreSQL::Test::Cluster;
> > +use PostgreSQL::Test::Utils;
> > +use Test::More tests => 6;
> >
> > 3) Some of the comments uses a punctuation mark and some of them does
> > not use, Should we keep it consistent:
> > +    # Wait for worker error
> > +    $node_subscriber->poll_query_until(
> > +       'postgres',
> >
> > +    # Set skip xid
> > +    $node_subscriber->safe_psql(
> > +       'postgres',
> >
> > +# Create publisher node.
> > +my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
> > +$node_publisher->init(allows_streaming => 'logical');
> >
> >
> > +# Create subscriber node.
> > +my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
> >
> > 4) Should this be changed:
> > + * True if we are skipping all data modification changes (INSERT, UPDATE, etc.) of
> > + * the specified transaction at MySubscription->skipxid.  Once we start skipping
> > + * changes, we don't stop it until the we skip all changes of the transaction even
> > + * if pg_subscription is updated that and MySubscription->skipxid gets changed or
> > to:
> > + * True if we are skipping all data modification changes (INSERT, UPDATE, etc.) of
> > + * the specified transaction at MySubscription->skipxid.  Once we start skipping
> > + * changes, we don't stop it until we skip all changes of the transaction even
> > + * if pg_subscription is updated that and MySubscription->skipxid gets changed or
> >
> > In "stop it until the we skip all changes", here the is not required.
> >
>
> I agree with all the comments above. I've attached an updated patch.

Thanks for the updated patch. A few minor comments:
1) Should "SKIP" be "SKIP (" here:
@@ -1675,7 +1675,7 @@ psql_completion(const char *text, int start, int end)
        /* ALTER SUBSCRIPTION <name> */
        else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
                COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
-                                         "RENAME TO", "REFRESH PUBLICATION", "SET",
+                                         "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP",

2) We could add a test for this if possible:
+               case ALTER_SUBSCRIPTION_SKIP:
+                       {
+                               if (!superuser())
+                                       ereport(ERROR,
+                                                       (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+                                                        errmsg("must be superuser to skip transaction")));

3) There is a typo in the commit message; "transaciton" should be "transaction":
After skipping the transaciton the apply worker clears
pg_subscription.subskipxid.

Another small typo; "susbscriber" should be "subscriber":
+      prepared by enabling <literal>two_phase</literal> on susbscriber.  After
+      the logical replication successfully skips the transaction, the transaction

4) Should skipsubxid be mentioned as subskipxid here:
+      * Clear the subskipxid of pg_subscription catalog.  This catalog
+      * update must be committed before finishing prepared transaction.
+      * Because otherwise, in a case where the server crashes between
+      * finishing prepared transaction and the catalog update, COMMIT
+      * PREPARED won’t be resent but skipsubxid is left.

Regards,
Vignesh

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

15 January 2022, 13:23:49

On Fri, Jan 14, 2022 at 7:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I agree with all the comments above. I've attached an updated patch.
>

Review comments
================
1.
+
+  <para>
+   In this case, you need to consider changing the data on the subscriber so that it

The start of this sentence doesn't make sense to me. How about
changing it to something like: "To resolve conflicts, you need to ..."?

2.
+      <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
+      is cleared.  See <xref linkend="logical-replication-conflicts"/> for
+      the details of logical replication conflicts.
+     </para>
+
+     <para>
+      <replaceable>skip_option</replaceable> specifies options for this operation.
+      The supported option is:
+
+      <variablelist>
+       <varlistentry>
+        <term><literal>xid</literal> (<type>xid</type>)</term>
+        <listitem>
+         <para>
+          Specifies the ID of the transaction whose changes are to be skipped
+          by the logical replication worker. Setting <literal>NONE</literal> resets
+          the transaction ID.
+         </para>

The spacing after the end of a sentence is inconsistent. I personally
use a single space before a new sentence, but I see that others use two
spaces and the nearby documentation also uses two spaces, so I am fine
either way. Let's just be consistent.

3.
+ case ALTER_SUBSCRIPTION_SKIP:
+ {
+ if (!superuser())
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to skip transaction")));
+
+ parse_subscription_options(pstate, stmt->options, SUBOPT_XID, &opts);
+
+ if (IsSet(opts.specified_opts, SUBOPT_XID))
..
..

Is there a case when the above 'if (IsSet(..' won't be true? If not,
then there should probably be an Assert instead of the 'if'.

4.
+static TransactionId skipping_xid = InvalidTransactionId;

I find this variable name a bit odd. Can we name it skip_xid?

5.
+ * skipping_xid is valid if we are skipping all data modification changes
+ * (INSERT, UPDATE, etc.) of the specified transaction at MySubscription->skipxid.
+ * Once we start skipping changes, we don't stop it until we skip all changes

I think it would be better to write the first line of the comment as: "We
enable skipping all data modification changes (INSERT, UPDATE, etc.)
for the subscription if the user has specified skip_xid. Once we ..."

6.
+static void
+maybe_start_skipping_changes(TransactionId xid)
+{
+ Assert(!is_skipping_changes());
+ Assert(!in_remote_transaction);
+ Assert(!in_streamed_transaction);
+
+ /* Make sure subscription cache is up-to-date */
+ maybe_reread_subscription();

Why do we need to update the cache here by calling
maybe_reread_subscription() and at other places in the patch? It is
sufficient to get the skip_xid value at the start of the worker via
GetSubscription().

7. In maybe_reread_subscription(), don't we need to check whether
skip_xid has changed, at the point where we compare the other
subscription parameters and exit so that the worker is relaunched?

8.
+static void
+clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
+{
+ Relation rel;
+ Form_pg_subscription subform;
+ HeapTuple tup;
+ bool nulls[Natts_pg_subscription];
+ bool replaces[Natts_pg_subscription];
+ Datum values[Natts_pg_subscription];
+
+ memset(values, 0, sizeof(values));
+ memset(nulls, false, sizeof(nulls));
+ memset(replaces, false, sizeof(replaces));
+
+ if (!IsTransactionState())
+ StartTransactionCommand();
+
+ LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
+ AccessShareLock);

It is important to add a comment as to why we need a lock here.

9.
+ * needs to be set subskipxid again.  We can reduce the possibility by
+ * logging a replication origin WAL record to advance the origin LSN
+ * instead but it doesn't seem to be worth since it's a very minor case.

You can also add here that there is no way to advance origin_timestamp
so that would be inconsistent.

10.
+clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
+ TimestampTz origin_timestamp)
{
..
..
+ if (!IsTransactionState())
+ StartTransactionCommand();
..
..
+ CommitTransactionCommand();
..
}

The transaction should be committed in this function if it is started
here; otherwise, it should be the responsibility of the caller to commit
it.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

15 January 2022, 13:28:21

On Fri, Jan 14, 2022 at 5:35 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Thanks for the updated patch, few minor comments:
> 1) Should "SKIP" be "SKIP (" here:
> @@ -1675,7 +1675,7 @@ psql_completion(const char *text, int start, int end)
>         /* ALTER SUBSCRIPTION <name> */
>         else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
>                 COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
> -                                         "RENAME TO", "REFRESH PUBLICATION", "SET",
> +                                         "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP",
>

Won't the other rule added by the patch, shown below, be sufficient for
what you are asking?
+ /* ALTER SUBSCRIPTION <name> SKIP */
+ else if (Matches("ALTER", "SUBSCRIPTION", MatchAny, "SKIP"))
+ COMPLETE_WITH("(");

I might be missing something, but why do you think the handling of SKIP
should be any different from what we are doing for SET?
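
To make the two-rule arrangement concrete, here is a minimal, self-contained C sketch. It is not psql's tab-completion code: complete_alter_subscription() is a hypothetical stand-in for the Matches()/COMPLETE_WITH() rules, and the keyword list contains only the entries visible in the diff quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Keywords offered after "ALTER SUBSCRIPTION <name>" (only those visible in the quoted diff). */
static const char *const after_name[] = {
    "CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
    "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP", NULL
};

/* Continuation offered after "ALTER SUBSCRIPTION <name> SKIP". */
static const char *const after_skip[] = { "(", NULL };

/*
 * Hypothetical, simplified rule lookup: returns a NULL-terminated list of
 * suggestions, or NULL if no rule matches.  The third word (the
 * subscription name) matches anything, like MatchAny.
 */
static const char *const *
complete_alter_subscription(const char *const *words, size_t nwords)
{
    assert(words != NULL);

    if (nwords < 3 ||
        strcmp(words[0], "ALTER") != 0 ||
        strcmp(words[1], "SUBSCRIPTION") != 0)
        return NULL;

    if (nwords == 3)
        return after_name;      /* first rule: offers the bare keyword "SKIP" */
    if (nwords == 4 && strcmp(words[3], "SKIP") == 0)
        return after_skip;      /* second rule: offers "(" */
    return NULL;
}
```

In this model, the first rule offers the bare keyword and the second one supplies the opening parenthesis, so "SKIP (" is reached in two completion steps.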

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

17 January 2022, 07:18:49

On Sat, Jan 15, 2022 at 7:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jan 14, 2022 at 7:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I agree with all the comments above. I've attached an updated patch.
> >
>
> Review comments
> ================

Thank you for the comments!

> 1.
> +
> +  <para>
> +   In this case, you need to consider changing the data on the subscriber so that it
>
> The starting of this sentence doesn't make sense to me. How about
> changing it like: "To resolve conflicts, you need to ...
>

Fixed.

> 2.
> +      <structname>pg_subscription</structname>.<structfield>subskipxid</structfield>)
> +      is cleared.  See <xref linkend="logical-replication-conflicts"/> for
> +      the details of logical replication conflicts.
> +     </para>
> +
> +     <para>
> +      <replaceable>skip_option</replaceable> specifies options for this operation.
> +      The supported option is:
> +
> +      <variablelist>
> +       <varlistentry>
> +        <term><literal>xid</literal> (<type>xid</type>)</term>
> +        <listitem>
> +         <para>
> +          Specifies the ID of the transaction whose changes are to be skipped
> +          by the logical replication worker. Setting <literal>NONE</literal> resets
> +          the transaction ID.
> +         </para>
>
> Empty spaces after line finish are inconsistent. I personally use a
> single space before a new line but I see that others use two spaces
> and the nearby documentation also uses two spaces in this regard so I
> am fine either way but let's be consistent.

Fixed.

>
> 3.
> + case ALTER_SUBSCRIPTION_SKIP:
> + {
> + if (!superuser())
> + ereport(ERROR,
> + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> + errmsg("must be superuser to skip transaction")));
> +
> + parse_subscription_options(pstate, stmt->options, SUBOPT_XID, &opts);
> +
> + if (IsSet(opts.specified_opts, SUBOPT_XID))
> ..
> ..
>
> Is there a case when the above 'if (IsSet(..' won't be true? If not,
> then probably there should be Assert instead of 'if'.
>

Fixed.

> 4.
> +static TransactionId skipping_xid = InvalidTransactionId;
>
> I find this variable name bit odd. Can we name it skip_xid?
>

Okay, renamed.

> 5.
> + * skipping_xid is valid if we are skipping all data modification changes
> + * (INSERT, UPDATE, etc.) of the specified transaction at MySubscription->skipxid.
> + * Once we start skipping changes, we don't stop it until we skip all changes
>
> I think it would be better to write the first line of comment as: "We
> enable skipping all data modification changes (INSERT, UPDATE, etc.)
> for the subscription if the user has specified skip_xid. Once we ..."
>

Changed.

> 6.
> +static void
> +maybe_start_skipping_changes(TransactionId xid)
> +{
> + Assert(!is_skipping_changes());
> + Assert(!in_remote_transaction);
> + Assert(!in_streamed_transaction);
> +
> + /* Make sure subscription cache is up-to-date */
> + maybe_reread_subscription();
>
> Why do we need to update the cache here by calling
> maybe_reread_subscription() and at other places in the patch? It is
> sufficient to get the skip_xid value at the start of the worker via
> GetSubscription().

MySubscription could be out of date after a user changes the catalog.
In the non-skipping case, we check it when starting the transaction in
begin_replication_step(), which is called, e.g., when applying an
insert change. But I think we need to make sure it's up to date at the
beginning of applying changes, that is, before starting a transaction.
Otherwise, we may end up skipping the transaction based on an
out-of-date subscription cache.

The reason for calling maybe_reread_subscription() in both
apply_handle_commit_prepared() and apply_handle_rollback_prepared() is
the same: MySubscription could be out of date when applying
commit-prepared or rollback-prepared, since we have not called
begin_replication_step() to open a new transaction.

>
> 7. In maybe_reread_subscription(), isn't there a need to check whether
> skip_xid is changed where we exit and launch the worker and compare
> other subscription parameters?

IIUC we relaunch the worker here when subscription parameters such as
slot_name are changed. In the current implementation, I think that
relaunching the worker is not necessarily required when skip_xid is
changed. For instance, when skipping a prepared transaction, we
deliberately don't clear subskipxid of pg_subscription at prepare time
and do that in the commit-prepared or rollback-prepared case instead.
There is a chance that the user changes skip_xid before commit-prepared
or rollback-prepared, but we tolerate this case.

Also, in non-streaming and non-2PC cases, while skipping changes we
don't call maybe_reread_subscription() until all changes are skipped.
So changing skip_xid cannot cancel a skip that has already started.
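
The behaviour described for points 6 and 7 can be illustrated with a small, self-contained C model. This is a sketch under stated assumptions, not the patch's code: catalog_skipxid models pg_subscription.subskipxid, cached_skipxid models MySubscription->skipxid, copying one into the other models maybe_reread_subscription(), and begin_remote_xact(), apply_change() and commit_remote_xact() are hypothetical helpers. Clearing the catalog value only when it still equals the skipped XID is also an assumption of this model.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int xid_t;
#define INVALID_XID ((xid_t) 0)

/* Hypothetical models of the catalog value, the worker's cache, and the active skip. */
static xid_t catalog_skipxid = INVALID_XID;   /* pg_subscription.subskipxid */
static xid_t cached_skipxid = INVALID_XID;    /* MySubscription->skipxid */
static xid_t skip_xid = INVALID_XID;          /* transaction currently being skipped */

static bool is_skipping_changes(void) { return skip_xid != INVALID_XID; }

/* Refresh the cache first, then decide once, before the transaction starts. */
static void
begin_remote_xact(xid_t remote_xid)
{
    assert(!is_skipping_changes());
    cached_skipxid = catalog_skipxid;         /* models maybe_reread_subscription() */
    if (cached_skipxid != INVALID_XID && cached_skipxid == remote_xid)
        skip_xid = remote_xid;
}

/* Returns true if the change would be applied, false if it is skipped. */
static bool
apply_change(void)
{
    return !is_skipping_changes();
}

/* End of the remote transaction: stop skipping and clear the catalog value. */
static void
commit_remote_xact(void)
{
    if (!is_skipping_changes())
        return;
    if (catalog_skipxid == skip_xid)          /* assumption: clear only our own XID */
        catalog_skipxid = INVALID_XID;
    skip_xid = INVALID_XID;
}
```

In this model the catalog is consulted only in begin_remote_xact(), so a change made to catalog_skipxid while a transaction is being skipped has no effect until the next transaction begins.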

>
> 8.
> +static void
> +clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
> + TimestampTz origin_timestamp)
> +{
> + Relation rel;
> + Form_pg_subscription subform;
> + HeapTuple tup;
> + bool nulls[Natts_pg_subscription];
> + bool replaces[Natts_pg_subscription];
> + Datum values[Natts_pg_subscription];
> +
> + memset(values, 0, sizeof(values));
> + memset(nulls, false, sizeof(nulls));
> + memset(replaces, false, sizeof(replaces));
> +
> + if (!IsTransactionState())
> + StartTransactionCommand();
> +
> + LockSharedObject(SubscriptionRelationId, MySubscription->oid, 0,
> + AccessShareLock);
>
> It is important to add a comment as to why we need a lock here.

Added.

>
> 9.
> + * needs to be set subskipxid again.  We can reduce the possibility by
> + * logging a replication origin WAL record to advance the origin LSN
> + * instead but it doesn't seem to be worth since it's a very minor case.
>
> You can also add here that there is no way to advance origin_timestamp
> so that would be inconsistent.

Added.

>
> 10.
> +clear_subscription_skip_xid(TransactionId xid, XLogRecPtr origin_lsn,
> + TimestampTz origin_timestamp)
> {
> ..
> ..
> + if (!IsTransactionState())
> + StartTransactionCommand();
> ..
> ..
> + CommitTransactionCommand();
> ..
> }
>
> The transaction should be committed in this function if it is started
> here otherwise it should be the responsibility of the caller to commit
> it.

Fixed.

I've attached an updated patch that incorporates these comments except
for 6 and 7, which probably need more discussion. The comments
from Vignesh are also incorporated.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

v5-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patch

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

17 January 2022, 07:20:19

On Fri, Jan 14, 2022 at 9:05 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, Jan 14, 2022 at 7:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Jan 12, 2022 at 11:10 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Wed, Jan 12, 2022 at 11:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Wed, Jan 12, 2022 at 12:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Wed, Jan 12, 2022 at 5:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > > On Tue, Jan 11, 2022 at 7:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > >
> > > > > > > On Tue, Jan 11, 2022 at 1:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On second thought, the same is true for other cases, for example,
> > > > > > > > preparing the transaction and clearing skip_xid while handling a
> > > > > > > > prepare message. That is, currently we don't clear skip_xid while
> > > > > > > > handling a prepare message but do that while handling commit/rollback
> > > > > > > > prepared message, in order to avoid the worst case. If we do both
> > > > > > > > while handling a prepare message and the server crashes between them,
> > > > > > > > it ends up that skip_xid is cleared and the transaction will be
> > > > > > > > resent, which is identical to the worst-case above.
> > > > > > > >
> > > > > > >
> > > > > > > How are you thinking to update the skip xid before prepare? If we do
> > > > > > > it in the same transaction then the changes in the catalog will be
> > > > > > > part of the prepared xact but won't be committed. Now, say if we do it
> > > > > > > after prepare, then the situation won't be the same because after
> > > > > > > restart the same xact won't appear again.
> > > > > >
> > > > > > I was thinking to commit the catalog change first in a separate
> > > > > > transaction while not updating origin LSN and then prepare an empty
> > > > > > transaction while updating origin LSN.
> > > > > >
> > > > >
> > > > > But, won't it complicate the handling if in the future we try to
> > > > > enhance this API such that it skips partial changes like skipping only
> > > > > for particular relation(s) or particular operations as discussed
> > > > > previously in this thread?
> > > >
> > > > Right. I was thinking that if we accept the situation that the user
> > > > has to set skip_xid again in case of the server crashes, we might be
> > > > able to accept also the situation that the user has to clear skip_xid
> > > > in a case of the server crashes. But it seems the former is less
> > > > problematic.
> > > >
> > > > I've attached an updated patch that incorporated all comments I got so far.
> > >
> > > Thanks for the updated patch, few comments:
> >
> > Thank you for the comments!
> >
> > > 1) Currently skip xid is not displayed in describe subscriptions, can
> > > we include it too:
> > > \dRs+  sub1
> > >                                                         List of subscriptions
> > >  Name |  Owner  | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit |            Conninfo
> > > ------+---------+---------+-------------+--------+-----------+------------------+--------------------+--------------------------------
> > >  sub1 | vignesh | t       | {pub1}      | f      | f         | e                | off                | dbname=postgres host=localhost
> > > (1 row)
> > >
> > > 2) This import "use PostgreSQL::Test::Utils;" is not required:
> > > +# Tests for skipping logical replication transactions.
> > > +use strict;
> > > +use warnings;
> > > +use PostgreSQL::Test::Cluster;
> > > +use PostgreSQL::Test::Utils;
> > > +use Test::More tests => 6;
> > >
> > > 3) Some of the comments uses a punctuation mark and some of them does
> > > not use, Should we keep it consistent:
> > > +    # Wait for worker error
> > > +    $node_subscriber->poll_query_until(
> > > +       'postgres',
> > >
> > > +    # Set skip xid
> > > +    $node_subscriber->safe_psql(
> > > +       'postgres',
> > >
> > > +# Create publisher node.
> > > +my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
> > > +$node_publisher->init(allows_streaming => 'logical');
> > >
> > >
> > > +# Create subscriber node.
> > > +my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
> > >
> > > 4) Should this be changed:
> > > + * True if we are skipping all data modification changes (INSERT, UPDATE, etc.) of
> > > + * the specified transaction at MySubscription->skipxid.  Once we start skipping
> > > + * changes, we don't stop it until the we skip all changes of the transaction even
> > > + * if pg_subscription is updated that and MySubscription->skipxid gets changed or
> > > to:
> > > + * True if we are skipping all data modification changes (INSERT, UPDATE, etc.) of
> > > + * the specified transaction at MySubscription->skipxid.  Once we start skipping
> > > + * changes, we don't stop it until we skip all changes of the transaction even
> > > + * if pg_subscription is updated that and MySubscription->skipxid gets changed or
> > >
> > > In "stop it until the we skip all changes", here the is not required.
> > >
> >
> > I agree with all the comments above. I've attached an updated patch.
>
> Thanks for the updated patch, few minor comments:

Thank you for the comments.

> 1) Should "SKIP" be "SKIP (" here:
> @@ -1675,7 +1675,7 @@ psql_completion(const char *text, int start, int end)
>         /* ALTER SUBSCRIPTION <name> */
>         else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
>                 COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
> -                                         "RENAME TO", "REFRESH PUBLICATION", "SET",
> +                                         "RENAME TO", "REFRESH PUBLICATION", "SET", "SKIP",

As Amit mentioned, it's consistent with the SET option.

>
> 2) We could add a test for this if possible:
> +               case ALTER_SUBSCRIPTION_SKIP:
> +                       {
> +                               if (!superuser())
> +                                       ereport(ERROR,
> +                                                       (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> +                                                        errmsg("must be superuser to skip transaction")));
>
> 3) There was one typo in commit message, transaciton shoudl be transaction:
> After skipping the transaciton the apply worker clears
> pg_subscription.subskipxid.
>
> Another small typo, susbscriber should be subscriber:
> +      prepared by enabling <literal>two_phase</literal> on susbscriber.  After
> +      the logical replication successfully skips the transaction, the transaction
>
> 4) Should skipsubxid be mentioned as subskipxid here:
> +      * Clear the subskipxid of pg_subscription catalog.  This catalog
> +      * update must be committed before finishing prepared transaction.
> +      * Because otherwise, in a case where the server crashes between
> +      * finishing prepared transaction and the catalog update, COMMIT
> +      * PREPARED won’t be resent but skipsubxid is left.
>

The above comments were incorporated into the latest v5 patch I just
submitted[1].

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoCd3Y2-b67%2BpVrzrdteUmup1XG6JeHYOa5dGjh8qZ3VuQ%40mail.gmail.com

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

17 January 2022, 08:47:58

On Mon, Jan 17, 2022 at 9:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sat, Jan 15, 2022 at 7:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> > 6.
> > +static void
> > +maybe_start_skipping_changes(TransactionId xid)
> > +{
> > + Assert(!is_skipping_changes());
> > + Assert(!in_remote_transaction);
> > + Assert(!in_streamed_transaction);
> > +
> > + /* Make sure subscription cache is up-to-date */
> > + maybe_reread_subscription();
> >
> > Why do we need to update the cache here by calling
> > maybe_reread_subscription() and at other places in the patch? It is
> > sufficient to get the skip_xid value at the start of the worker via
> > GetSubscription().
>
> MySubscription could be out-of-date after a user changes the catalog.
> In non-skipping change cases, we check it when starting the
> transaction in begin_replication_step() which is called, e.g., when
> applying an insert change. But I think we need to make sure it’s
> up-to-date at the beginning of applying changes, that is, before
> starting a transaction. Otherwise, we may end up skipping the
> transaction based on out-of-dated subscription cache.
>

I thought the user would normally set skip_xid only after an error
which means that the value should be as new as the time of the start
of the worker. I am slightly worried about the cost we might need to
pay for this additional look-up in case skip_xid is not changed. Do
you see any valid user scenario where we might not see the required
skip_xid? I am okay with calling this if we really need it.

> >
> > 7. In maybe_reread_subscription(), isn't there a need to check whether
> > skip_xid is changed where we exit and launch the worker and compare
> > other subscription parameters?
>
> IIUC we relaunch the worker here when subscription parameters such as
> slot_name was changed. In the current implementation, I think that
> relaunching the worker is not necessarily necessary when skip_xid is
> changed. For instance, when skipping the prepared transaction, we
> deliberately don’t clear subskipxid of pg_subscription and do that at
> commit-prepared or rollback-prepared case. There are chances that the
> user changes skip_xid before commit-prepared or rollback-prepared. But
> we tolerate this case.
>

I think that between prepare and commit prepared, the user only needs
to change it if there is another error, in which case we will restart
anyway and load the new value. But I understand that we don't need to
restart when skip_xid is changed, as it might not impact the remote
connection in any way, so I am fine with not doing anything for this.
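
The prepared-transaction behavior described in the quoted reply can be modeled roughly as follows. This is a sketch based only on the description in this thread, with hypothetical handler names (on_begin, on_prepare, and so on) and a plain variable standing in for subskipxid in pg_subscription; it is not the patch's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define INVALID_XID ((TransactionId) 0)

static TransactionId subskipxid = INVALID_XID;	/* models the catalog column */
static bool skipping = false;	/* are we discarding the current xact? */

/* BEGIN / BEGIN PREPARE: start skipping if the remote XID matches. */
static void
on_begin(TransactionId remote_xid)
{
	assert(!skipping);
	skipping = (subskipxid != INVALID_XID && subskipxid == remote_xid);
}

/* COMMIT: a skipped transaction is finished, so clear the skip XID. */
static void
on_commit(void)
{
	if (skipping)
	{
		subskipxid = INVALID_XID;
		skipping = false;
	}
}

/* PREPARE: stop skipping, but deliberately leave subskipxid set. */
static void
on_prepare(void)
{
	skipping = false;
}

/* COMMIT PREPARED / ROLLBACK PREPARED: clear it now if it still matches. */
static void
on_finish_prepared(TransactionId remote_xid)
{
	if (subskipxid != INVALID_XID && subskipxid == remote_xid)
		subskipxid = INVALID_XID;
}
```

If the user changes the value between on_prepare() and on_finish_prepared(), the comparison simply fails and the new value is left in place, which corresponds to the case the thread says is tolerated.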

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 17, 2022, 09:18:01

On Mon, Jan 17, 2022 at 2:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Jan 17, 2022 at 9:49 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Sat, Jan 15, 2022 at 7:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > > 6.
> > > +static void
> > > +maybe_start_skipping_changes(TransactionId xid)
> > > +{
> > > + Assert(!is_skipping_changes());
> > > + Assert(!in_remote_transaction);
> > > + Assert(!in_streamed_transaction);
> > > +
> > > + /* Make sure subscription cache is up-to-date */
> > > + maybe_reread_subscription();
> > >
> > > Why do we need to update the cache here by calling
> > > maybe_reread_subscription() and at other places in the patch? It is
> > > sufficient to get the skip_xid value at the start of the worker via
> > > GetSubscription().
> >
> > MySubscription could be out-of-date after a user changes the catalog.
> > In non-skipping change cases, we check it when starting the
> > transaction in begin_replication_step() which is called, e.g., when
> > applying an insert change. But I think we need to make sure it’s
> > up-to-date at the beginning of applying changes, that is, before
> > starting a transaction. Otherwise, we may end up skipping the
> > transaction based on out-of-dated subscription cache.
> >
>
> I thought the user would normally set skip_xid only after an error
> which means that the value should be as new as the time of the start
> of the worker. I am slightly worried about the cost we might need to
> pay for this additional look-up in case skip_xid is not changed. Do
> you see any valid user scenario where we might not see the required
> skip_xid? I am okay with calling this if we really need it.

Fair point. I've changed the code accordingly.

>
> > >
> > > 7. In maybe_reread_subscription(), isn't there a need to check whether
> > > skip_xid is changed where we exit and launch the worker and compare
> > > other subscription parameters?
> >
> > IIUC we relaunch the worker here when subscription parameters such as
> > slot_name was changed. In the current implementation, I think that
> > relaunching the worker is not necessarily necessary when skip_xid is
> > changed. For instance, when skipping the prepared transaction, we
> > deliberately don’t clear subskipxid of pg_subscription and do that at
> > commit-prepared or rollback-prepared case. There are chances that the
> > user changes skip_xid before commit-prepared or rollback-prepared. But
> > we tolerate this case.
> >
>
> I think between prepare and commit prepared, the user only needs to
> change it if there is another error in which case we will anyway
> restart and load the new value of same. But, I understand that we
> don't need to restart if skip_xid is changed as it might not impact
> remote connection in any way, so I am fine for not doing anything for
> this.

I'll leave this part for now. We can change it later if others think
it's necessary.

I've attached an updated patch. Please review it.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

v6-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patch

RE: Skipping logical replication transactions on subscriber side

From

"osumi.takamichi@fujitsu.com"

Date:

January 17, 2022, 11:03:17

On Monday, January 17, 2022 3:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> I've attached an updated patch. Please review it.
Hi, thank you for sharing a new patch.
A few comments on v6.

(1) doc/src/sgml/ref/alter_subscription.sgml

+      resort.  This option has no effect on the transaction that is already

One TAB exists between "resort" and "This".

(2) Minor improvement suggestion of comment in src/backend/replication/logical/worker.c

+ * reset during that.  Also, we don't skip receiving the changes in streaming
+ * cases, since we decide whether or not to skip applying the changes when

I suggest that you don't use 'streaming cases', because
"streaming cases" sounds a bit broader than your actual implementation.
We do skip streamed transactions, just not during the spooling phase, right?

I suggest the following:

"We don't skip receiving the changes at the phase to spool streaming transactions"

(3) in the comment of apply_handle_prepare_internal, two full-width characters.

3-1
+     * won’t be resent in a case where the server crashes between them.

3-2
+     * COMMIT PREPARED or ROLLBACK PREPARED. But that’s okay because this

You have full-width characters for "won't" and "that's".
Could you please check ?


(4) typo

+ * the subscription if hte user has specified skip_xid. Once we start skipping

"hte" should be "the"?

(5)

I may be missing something here, but in one of
the past discussions there seemed to be a consensus that
if the user specifies the XID of a subtransaction,
it would be better to skip only that subtransaction.

Is that out of scope for this patch?
If so, I suggest you include some description of it
either in the commit message or near the related code.

(6)

I think it would be better for the TAP test for this feature
to include a case checking that skipping an aborted streaming
transaction clears the XID, so as to cover
various new code paths. Did you have any special reason
to omit this case?

(7)

I would like the comment to explain why the subscriber is restarted
in the TAP test, because this is not a mandatory operation.
(The TAP tests pass without this restart.)

From:
# Restart the subscriber node to restart logical replication with no interval

IIUC, the following would be better.

To:
# As an optimization to finish tests earlier, restart the subscriber with no interval,
# rather than waiting for a new error to launch a new apply worker.


Best Regards,
    Takamichi Osumi

RE: Skipping logical replication transactions on subscriber side

From

"osumi.takamichi@fujitsu.com"

Date:

January 17, 2022, 15:34:56

On Monday, January 17, 2022 5:03 PM I wrote:
> Hi, thank you for sharing a new patch.
> Few comments on the v6.
> 
> (1) doc/src/sgml/ref/alter_subscription.sgml
> 
> +      resort.  This option has no effect on the transaction that is
> + already
> 
> One TAB exists between "resort" and "This".
> 
> (2) Minor improvement suggestion of comment in
> src/backend/replication/logical/worker.c
> 
> + * reset during that.  Also, we don't skip receiving the changes in
> + streaming
> + * cases, since we decide whether or not to skip applying the changes
> + when
> 
> I sugguest that you don't use 'streaming cases', because what "streaming
> cases" means sounds a bit broader than actual your implementation.
> We do skip transaction of streaming cases but not during the spooling phase,
> right ?
> 
> I suggest below.
> 
> "We don't skip receiving the changes at the phase to spool streaming
> transactions"
> 
> (3) in the comment of apply_handle_prepare_internal, two full-width
> characters.
> 
> 3-1
> +     * won’t be resent in a case where the server crashes between them.
> 
> 3-2
> +     * COMMIT PREPARED or ROLLBACK PREPARED. But that’s okay
> because this
> 
> You have full-width characters for "won't" and "that's".
> Could you please check ?
> 
> 
> (4) typo
> 
> + * the subscription if hte user has specified skip_xid. Once we start
> + skipping
> 
> "hte" should "the" ?
> 
> (5)
> 
> I can miss something here but, in one of the past discussions, there seems a
> consensus that if the user specifies XID of a subtransaction, it would be better
> to skip only the subtransaction.
> 
> This time, is it out of the range of the patch ?
> If so, I suggest you include some description about it either in the commit
> message or around codes related to it.
> 
> (6)
> 
> I feel it's a better idea to include a test whether to skip aborted streaming
> transaction clears the XID in the TAP test for this feature, in a sense to cover
> various new code paths. Did you have any special reason to omit the case ?
> 
> (7)
> 
> I want more explanation for the reason to restart the subscriber in the TAP test
> because this is not mandatory operation.
> (We can pass the TAP tests without this restart)
> 
> From :
> # Restart the subscriber node to restart logical replication with no interval
> 
> IIUC, below would be better.
> 
> To :
> # As an optimization to finish tests earlier, restart the subscriber with no
> interval, # rather than waiting for new error to laucher a new apply worker.

A few more minor comments:

(8) another full-width char in apply_handle_commit_prepared


+                * PREPARED won't be resent but subskipxid is left.

Kindly check "won't" ?

(9) the header comments of clear_subscription_skip_xid

+/* clear subskipxid of pg_subscription catalog */

Should this start with an uppercase letter?

(10) some variable declarations and initialization of clear_subscription_skip_xid

There's no harm in moving the code below into the branch that is
taken when the user has not changed subskipxid before the
apply worker clears it.

+       bool            nulls[Natts_pg_subscription];
+       bool            replaces[Natts_pg_subscription];
+       Datum           values[Natts_pg_subscription];
+
+       memset(values, 0, sizeof(values));
+       memset(nulls, false, sizeof(nulls));
+       memset(replaces, false, sizeof(replaces));
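
A minimal standalone sketch of the suggested restructuring, under the assumption that the function first compares the stored value with the XID being skipped. CatalogRow, apply_update, NATTS and COL_SKIPXID are hypothetical stand-ins for the real catalog machinery (heap_modify_tuple and friends); the point is only that the arrays are set up after the early exit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t TransactionId;
#define INVALID_XID ((TransactionId) 0)
#define NATTS 4					/* hypothetical number of columns */
#define COL_SKIPXID 3			/* hypothetical position of subskipxid */

/* Hypothetical model of one catalog row. */
typedef struct
{
	uint32_t	cols[NATTS];
} CatalogRow;

/* Copy every column flagged in "replaces" from "values" into the row. */
static void
apply_update(CatalogRow *row, const uint32_t *values, const bool *replaces)
{
	for (int i = 0; i < NATTS; i++)
	{
		if (replaces[i])
			row->cols[i] = values[i];
	}
}

/* Clear the skip XID unless the user changed it in the meantime. */
static bool
clear_skip_xid(CatalogRow *row, TransactionId skipped_xid)
{
	assert(row != NULL);

	/* The user changed or reset the value: nothing to build or update. */
	if (row->cols[COL_SKIPXID] != skipped_xid)
		return false;

	/* Only this path needs the update arrays, so set them up here. */
	uint32_t	values[NATTS];
	bool		replaces[NATTS];

	memset(values, 0, sizeof(values));
	memset(replaces, false, sizeof(replaces));

	values[COL_SKIPXID] = INVALID_XID;
	replaces[COL_SKIPXID] = true;
	apply_update(row, values, replaces);
	return true;
}
```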


Best Regards,
    Takamichi Osumi

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 17, 2022, 15:51:48

On Mon, Jan 17, 2022 at 5:03 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Monday, January 17, 2022 3:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > I've attached an updated patch. Please review it.
> Hi, thank you for sharing a new patch.
> Few comments on the v6.

Thank you for the comments!

>
> (1) doc/src/sgml/ref/alter_subscription.sgml
>
> +      resort.  This option has no effect on the transaction that is already
>
> One TAB exists between "resort" and "This".

Will remove.

>
> (2) Minor improvement suggestion of comment in src/backend/replication/logical/worker.c
>
> + * reset during that.  Also, we don't skip receiving the changes in streaming
> + * cases, since we decide whether or not to skip applying the changes when
>
> I sugguest that you don't use 'streaming cases', because
> what "streaming cases" means sounds a bit broader than actual your implementation.
> We do skip transaction of streaming cases but not during the spooling phase, right ?
>
> I suggest below.
>
> "We don't skip receiving the changes at the phase to spool streaming transactions"

I might be missing your point, but I think it's correct that we don't
skip receiving the changes of a transaction that is sent via the
streaming protocol. And it doesn't sound broader to me. Could you
elaborate on that?

>
> (3) in the comment of apply_handle_prepare_internal, two full-width characters.
>
> 3-1
> +        * won’t be resent in a case where the server crashes between them.
>
> 3-2
> +        * COMMIT PREPARED or ROLLBACK PREPARED. But that’s okay because this
>
> You have full-width characters for "won't" and "that's".
> Could you please check ?

Which characters in "won't" are full-width characters? I could not find them.

>
>
> (4) typo
>
> + * the subscription if hte user has specified skip_xid. Once we start skipping
>
> "hte" should "the" ?

Will fix.

>
> (5)
>
> I can miss something here but, in one of
> the past discussions, there seems a consensus that
> if the user specifies XID of a subtransaction,
> it would be better to skip only the subtransaction.
>
> This time, is it out of the range of the patch ?
> If so, I suggest you include some description about it
> either in the commit message or around codes related to it.

How can the user know the subtransaction XID? I suppose you are
referring to streaming protocol cases, but while applying spooled
changes we report the subtransaction XID neither in the server log
nor in pg_stat_subscription_workers.

>
> (6)
>
> I feel it's a better idea to include a test whether
> to skip aborted streaming transaction clears the XID
> in the TAP test for this feature, in a sense to cover
> various new code paths. Did you have any special reason
> to omit the case ?

Which code path would be newly covered by a test for an aborted
streaming transaction? I think that the code in this patch is already
covered by the test for a committed-and-streamed transaction. It
doesn't matter whether the streamed transaction is committed or
aborted because an error occurs while applying the spooled changes.

>
> (7)
>
> I want more explanation for the reason to restart the subscriber
> in the TAP test because this is not mandatory operation.
> (We can pass the TAP tests without this restart)
>
> From :
> # Restart the subscriber node to restart logical replication with no interval
>
> IIUC, below would be better.
>
> To :
> # As an optimization to finish tests earlier, restart the subscriber with no interval,
> # rather than waiting for new error to laucher a new apply worker.

I could not understand why the proposed sentence has more information.
Does it mean you want to mention "As an optimization to finish tests
earlier"?

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 17, 2022, 15:54:59

On Mon, Jan 17, 2022 at 9:35 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Monday, January 17, 2022 5:03 PM I wrote:
> > Hi, thank you for sharing a new patch.
> > Few comments on the v6.
> >
> > (1) doc/src/sgml/ref/alter_subscription.sgml
> >
> > +      resort.  This option has no effect on the transaction that is
> > + already
> >
> > One TAB exists between "resort" and "This".
> >
> > (2) Minor improvement suggestion of comment in
> > src/backend/replication/logical/worker.c
> >
> > + * reset during that.  Also, we don't skip receiving the changes in
> > + streaming
> > + * cases, since we decide whether or not to skip applying the changes
> > + when
> >
> > I sugguest that you don't use 'streaming cases', because what "streaming
> > cases" means sounds a bit broader than actual your implementation.
> > We do skip transaction of streaming cases but not during the spooling phase,
> > right ?
> >
> > I suggest below.
> >
> > "We don't skip receiving the changes at the phase to spool streaming
> > transactions"
> >
> > (3) in the comment of apply_handle_prepare_internal, two full-width
> > characters.
> >
> > 3-1
> > +      * won’t be resent in a case where the server crashes between them.
> >
> > 3-2
> > +      * COMMIT PREPARED or ROLLBACK PREPARED. But that’s okay
> > because this
> >
> > You have full-width characters for "won't" and "that's".
> > Could you please check ?
> >
> >
> > (4) typo
> >
> > + * the subscription if hte user has specified skip_xid. Once we start
> > + skipping
> >
> > "hte" should "the" ?
> >
> > (5)
> >
> > I can miss something here but, in one of the past discussions, there seems a
> > consensus that if the user specifies XID of a subtransaction, it would be better
> > to skip only the subtransaction.
> >
> > This time, is it out of the range of the patch ?
> > If so, I suggest you include some description about it either in the commit
> > message or around codes related to it.
> >
> > (6)
> >
> > I feel it's a better idea to include a test whether to skip aborted streaming
> > transaction clears the XID in the TAP test for this feature, in a sense to cover
> > various new code paths. Did you have any special reason to omit the case ?
> >
> > (7)
> >
> > I want more explanation for the reason to restart the subscriber in the TAP test
> > because this is not mandatory operation.
> > (We can pass the TAP tests without this restart)
> >
> > From :
> > # Restart the subscriber node to restart logical replication with no interval
> >
> > IIUC, below would be better.
> >
> > To :
> > # As an optimization to finish tests earlier, restart the subscriber with no
> > interval, # rather than waiting for new error to laucher a new apply worker.
> Few more minor comments

Thank you for the comments!

>
> (8) another full-width char in apply_handle_commit_prepared
>
>
> +                * PREPARED won't be resent but subskipxid is left.
>
> Kindly check "won't" ?

Again, I don't follow what you mean by full-width character in this context.

>
> (9) the header comments of clear_subscription_skip_xid
>
> +/* clear subskipxid of pg_subscription catalog */
>
> Should start with an upper letter ?

Okay, I'll change it.

>
> (10) some variable declarations and initialization of clear_subscription_skip_xid
>
> There's no harm in moving below codes into a condition case
> where the user didn't change the subskipxid before
> apply worker clearing it.
>
> +       bool            nulls[Natts_pg_subscription];
> +       bool            replaces[Natts_pg_subscription];
> +       Datum           values[Natts_pg_subscription];
> +
> +       memset(values, 0, sizeof(values));
> +       memset(nulls, false, sizeof(nulls));
> +       memset(replaces, false, sizeof(replaces));
>

Will move.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 17, 2022, 16:14:58

On Mon, Jan 17, 2022 at 6:22 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> >
> > (5)
> >
> > I can miss something here but, in one of
> > the past discussions, there seems a consensus that
> > if the user specifies XID of a subtransaction,
> > it would be better to skip only the subtransaction.
> >
> > This time, is it out of the range of the patch ?
> > If so, I suggest you include some description about it
> > either in the commit message or around codes related to it.
>
> How can the user know subtransaction XID? I suppose you refer to
> streaming protocol cases but while applying spooled changes we don't
> report subtransaction XID neither in server log nor
> pg_stat_subscription_workers.
>

I also think that in the current system users won't be aware of a
subtransaction's XID, but I feel Osumi-San's point is valid: we
should at least state in the docs that only top-level xacts can be
skipped. Also, it is not hard to imagine that in the future we could
make subtransaction XID info available to users, as we already have
it in the case of streaming xacts (see subxact_data).

Few minor points:
===============
1.
+ * the subscription if hte user has specified skip_xid.

Typo. /hte/the

2.
+ * PREPARED wonâ€™t be resent but subskipxid is left.

In the diffmerge tool, "won't" shows some funny characters. When I
manually removed 't and added it again, everything was fine. I am not
sure why that is. I think Osumi-San has also raised this complaint.

3.
+ /*
+ * We don't expect that the user set the XID of the transaction that is
+ * rolled back but if the skip XID is set, clear it.
+ */

/user set/user to set/

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Greg Nancarrow

Date:

January 18, 2022, 04:36:21

On Mon, Jan 17, 2022 at 5:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached an updated patch. Please review it.
>

Some review comments for the v6 patch:


doc/src/sgml/logical-replication.sgml

(1) Expanded output

Since the view output is shown in "expanded output" mode, perhaps the
doc should say that, or alternatively add the following lines prior to
it, to make it clear:

  postgres=# \x
  Expanded display is on.


(2) Message output in server log

The actual CONTEXT text now just says "at ..." instead of "with commit
timestamp ...", so the doc needs to be updated as follows:

BEFORE:
+CONTEXT:  processing remote data during "INSERT" for replication
target relation "public.test" in transaction 716 with commit timestamp
2021-09-29 15:52:45.165754+00
AFTER:
+CONTEXT:  processing remote data during "INSERT" for replication
target relation "public.test" in transaction 716 at 2021-09-29
15:52:45.165754+00

(3)
The wording "the change" doesn't seem right here, so I suggest the
following update:

BEFORE:
+   Skipping the whole transaction includes skipping the change that
may not violate
AFTER:
+   Skipping the whole transaction includes skipping changes that may
not violate


doc/src/sgml/ref/alter_subscription.sgml

(4)
I have a number of suggested wording improvements:

BEFORE:
+      Skips applying changes of the particular transaction.  If incoming data
+      violates any constraints the logical replication will stop until it is
+      resolved.  The resolution can be done either by changing data on the
+      subscriber so that it doesn't conflict with incoming change or
by skipping
+      the whole transaction.  The logical replication worker skips all data
+      modification changes within the specified transaction including
the changes
+      that may not violate the constraint, so, it should only be used as a last
+      resort. This option has no effect on the transaction that is already
+      prepared by enabling <literal>two_phase</literal> on subscriber.

AFTER:
+      Skips applying all changes of the specified transaction.  If
incoming data
+      violates any constraints, logical replication will stop until it is
+      resolved.  The resolution can be done either by changing data on the
+      subscriber so that it doesn't conflict with incoming change or
by skipping
+      the whole transaction.  Using the SKIP option, the logical
replication worker skips all data
+      modification changes within the specified transaction, including changes
+      that may not violate the constraint, so, it should only be used as a last
+      resort. This option has no effect on transactions that are already
+      prepared by enabling <literal>two_phase</literal> on the subscriber.


(5)
change -> changes

BEFORE:
+      subscriber so that it doesn't conflict with incoming change or
by skipping
AFTER:
+      subscriber so that it doesn't conflict with incoming changes or
by skipping


src/backend/replication/logical/worker.c

(6) Missing word?
The following should say "worth doing" or "worth it"?

+ * doesn't seem to be worth since it's a very minor case.


src/test/regress/sql/subscription.sql

(7) Misleading test case
I think the following test case is misleading and should be removed,
because the "1.1" xid value is only regarded as invalid because "1" is
an invalid xid (and there's already a test case for a "1" xid) - the
fractional part gets thrown away, and doesn't affect the validity
here.

   +ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1.1);
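
To illustrate the point, here is a small standalone sketch, assuming (as described above) that the parser keeps only the leading integer of the input. parse_xid_lenient and skip_xid_is_valid are hypothetical helpers, not functions from the patch; the only PostgreSQL fact relied on is that XIDs 0, 1 and 2 are reserved and the first normal XID is 3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for PostgreSQL's TransactionId definitions. */
typedef uint32_t TransactionId;
#define FIRST_NORMAL_XID ((TransactionId) 3)	/* 0, 1 and 2 are reserved */

/*
 * Parse the leading decimal integer of "str" and ignore whatever follows,
 * which is why "1.1" ends up as the same value as "1".
 */
static TransactionId
parse_xid_lenient(const char *str)
{
	assert(str != NULL);
	return (TransactionId) strtoul(str, NULL, 10);
}

/* A skip XID is acceptable only if it is a normal (non-reserved) XID. */
static bool
skip_xid_is_valid(const char *str)
{
	return parse_xid_lenient(str) >= FIRST_NORMAL_XID;
}
```

Under these assumptions "1.1" is rejected for the same reason as "1", while an input such as "716.9" would be accepted as 716, so the fractional test adds no coverage of its own.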


Regards,
Greg Nancarrow
Fujitsu Australia

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 18, 2022, 05:32:14

On Mon, Jan 17, 2022 at 10:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Jan 17, 2022 at 6:22 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > >
> > > (5)
> > >
> > > I can miss something here but, in one of
> > > the past discussions, there seems a consensus that
> > > if the user specifies XID of a subtransaction,
> > > it would be better to skip only the subtransaction.
> > >
> > > This time, is it out of the range of the patch ?
> > > If so, I suggest you include some description about it
> > > either in the commit message or around codes related to it.
> >
> > How can the user know subtransaction XID? I suppose you refer to
> > streaming protocol cases but while applying spooled changes we don't
> > report subtransaction XID neither in server log nor
> > pg_stat_subscription_workers.
> >
>
> I also think in the current system users won't be aware of
> subtransaction's XID but I feel Osumi-San's point is valid that we
> should at least add it in docs that we allow to skip only top-level
> xacts. Also, in the future, it won't be impossible to imagine that we
> can have subtransaction's XID info also available to users as we have
> that in the case of streaming xacts (See subxact_data).

Fair point and more accurate, but I'm a bit concerned that using these
words could confuse the user. There are some places in the doc where
we use the words “top-level transaction” and “sub transactions” but
these are not commonly used in the doc. The user normally would not be
aware that sub transactions are used to implement SAVEPOINTs. Also,
the publisher's subtransaction ID doesn’t appear anywhere on the
subscriber. So if we want to mention it, I think we should use other
words instead, but I don’t have a good idea for that. Do you
have any ideas?

>
> Few minor points:
> ===============
> 1.
> + * the subscription if hte user has specified skip_xid.
>
> Typo. /hte/the

Will fix.

>
> 2.
> + * PREPARED wonâ€™t be resent but subskipxid is left.
>
> In diffmerge tool, won't is showing some funny chars. When I manually
> removed 't and added it again, everything is fine. I am not sure why
> it is so? I think Osumi-San has also raised this complaint.

Oh, I didn't realize that. I'll check it again using the diffmerge tool.

>
> 3.
> + /*
> + * We don't expect that the user set the XID of the transaction that is
> + * rolled back but if the skip XID is set, clear it.
> + */
>
> /user set/user to set/

Will fix.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 18, 2022, 05:41:26

On Tue, Jan 18, 2022 at 10:36 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Mon, Jan 17, 2022 at 5:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached an updated patch. Please review it.
> >
>
> Some review comments for the v6 patch:

Thank you for the comments!

>
>
> doc/src/sgml/logical-replication.sgml
>
> (1) Expanded output
>
> Since the view output is shown in "expanded output" mode, perhaps the
> doc should say that, or alternatively add the following lines prior to
> it, to make it clear:
>
>   postgres=# \x
>   Expanded display is on.

I'm not sure it's really necessary. A similar example would be
perform.sgml but it doesn't say "\x".

>
>
> (2) Message output in server log
>
> The actual CONTEXT text now just says "at ..." instead of "with commit
> timestamp ...", so the doc needs to be updated as follows:
>
> BEFORE:
> +CONTEXT:  processing remote data during "INSERT" for replication
> target relation "public.test" in transaction 716 with commit timestamp
> 2021-09-29 15:52:45.165754+00
> AFTER:
> +CONTEXT:  processing remote data during "INSERT" for replication
> target relation "public.test" in transaction 716 at 2021-09-29
> 15:52:45.165754+00

Will fix.

>
> (3)
> The wording "the change" doesn't seem right here, so I suggest the
> following update:
>
> BEFORE:
> +   Skipping the whole transaction includes skipping the change that
> may not violate
> AFTER:
> +   Skipping the whole transaction includes skipping changes that may
> not violate
>
>
> doc/src/sgml/ref/alter_subscription.sgml

Will fix.

>
> (4)
> I have a number of suggested wording improvements:
>
> BEFORE:
> +      Skips applying changes of the particular transaction.  If incoming data
> +      violates any constraints the logical replication will stop until it is
> +      resolved.  The resolution can be done either by changing data on the
> +      subscriber so that it doesn't conflict with incoming change or
> by skipping
> +      the whole transaction.  The logical replication worker skips all data
> +      modification changes within the specified transaction including
> the changes
> +      that may not violate the constraint, so, it should only be used as a last
> +      resort. This option has no effect on the transaction that is already
> +      prepared by enabling <literal>two_phase</literal> on subscriber.
>
> AFTER:
> +      Skips applying all changes of the specified transaction.  If
> incoming data
> +      violates any constraints, logical replication will stop until it is
> +      resolved.  The resolution can be done either by changing data on the
> +      subscriber so that it doesn't conflict with incoming change or
> by skipping
> +      the whole transaction.  Using the SKIP option, the logical
> replication worker skips all data
> +      modification changes within the specified transaction, including changes
> +      that may not violate the constraint, so, it should only be used as a last
> +      resort. This option has no effect on transactions that are already
> +      prepared by enabling <literal>two_phase</literal> on the subscriber.
>

Will fix.

>
> (5)
> change -> changes
>
> BEFORE:
> +      subscriber so that it doesn't conflict with incoming change or
> by skipping
> AFTER:
> +      subscriber so that it doesn't conflict with incoming changes or
> by skipping

Will fix.

>
>
> src/backend/replication/logical/worker.c
>
> (6) Missing word?
> The following should say "worth doing" or "worth it"?
>
> + * doesn't seem to be worth since it's a very minor case.
>

Will fix.

>
> src/test/regress/sql/subscription.sql
>
> (7) Misleading test case
> I think the following test case is misleading and should be removed,
> because the "1.1" xid value is only regarded as invalid because "1" is
> an invalid xid (and there's already a test case for a "1" xid) - the
> fractional part gets thrown away, and doesn't affect the validity
> here.
>
>    +ALTER SUBSCRIPTION regress_testsub SKIP (xid = 1.1);
>

Good point. Will remove.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 18, 2022, 05:52:30

On Tue, Jan 18, 2022 at 8:02 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Jan 17, 2022 at 10:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Jan 17, 2022 at 6:22 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > >
> > > > (5)
> > > >
> > > > I can miss something here but, in one of
> > > > the past discussions, there seems a consensus that
> > > > if the user specifies XID of a subtransaction,
> > > > it would be better to skip only the subtransaction.
> > > >
> > > > This time, is it out of the range of the patch ?
> > > > If so, I suggest you include some description about it
> > > > either in the commit message or around codes related to it.
> > >
> > > How can the user know subtransaction XID? I suppose you refer to
> > > streaming protocol cases but while applying spooled changes we don't
> > > report subtransaction XID neither in server log nor
> > > pg_stat_subscription_workers.
> > >
> >
> > I also think in the current system users won't be aware of
> > subtransaction's XID but I feel Osumi-San's point is valid that we
> > should at least add it in docs that we allow to skip only top-level
> > xacts. Also, in the future, it won't be impossible to imagine that we
> > can have subtransaction's XID info also available to users as we have
> > that in the case of streaming xacts (See subxact_data).
>
> Fair point and more accurate, but I'm a bit concerned that using these
> words could confuse the user. There are some places in the doc where
> we use the words "top-level transaction" and "sub transactions" but
> these are not commonly used in the doc. The user normally would not be
> aware that sub transactions are used to implement SAVEPOINTs. Also,
> the publisher's subtransaction ID doesn't appear anywhere on the
> subscriber. So if we want to mention it, I think we should use other
> words instead of them but I don't have a good idea for that. Do you
> have any ideas?
>

How about changing the existing text:
+          Specifies the ID of the transaction whose changes are to be skipped
+          by the logical replication worker.  Setting <literal>NONE</literal>
+          resets the transaction ID.

to

Specifies the top-level transaction identifier whose changes are to be
skipped by the logical replication worker.  We don't support skipping
individual subtransactions.  Setting <literal>NONE</literal> resets
the transaction ID.

--
With Regards,
Amit Kapila.

RE: Skipping logical replication transactions on subscriber side

From

"tanghy.fnst@fujitsu.com"

Date:

January 18, 2022, 06:04:00

On Mon, Jan 17, 2022 2:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> 
> I've attached an updated patch. Please review it.
> 

Thanks for updating the patch. Few comments:

1)
        /* Two_phase is only supported in v15 and higher */
         if (pset.sversion >= 150000)
             appendPQExpBuffer(&buf,
-                              ", subtwophasestate AS \"%s\"\n",
-                              gettext_noop("Two phase commit"));
+                              ", subtwophasestate AS \"%s\"\n"
+                              ", subskipxid AS \"%s\"\n",
+                              gettext_noop("Two phase commit"),
+                              gettext_noop("Skip XID"));
 
         appendPQExpBuffer(&buf,
                           ",  subsynccommit AS \"%s\"\n"

I think "skip xid" should be mentioned in the comment. Maybe it could be changed to:
"Two_phase and skip XID are only supported in v15 and higher"

2) The following two places are not consistent in whether "= value" is surrounded
by square brackets.

+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable
class="parameter">skip_option</replaceable>[= <replaceable class="parameter">value</replaceable>] [, ... ] )
 

+    <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable
class="parameter">value</replaceable>[, ... ] )</literal></term>
 

Should we modify the first place to:
+ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable
class="parameter">skip_option</replaceable>= <replaceable class="parameter">value</replaceable> [, ... ] )
 

Because currently there is only one skip_option - xid, and a parameter must be
specified when using it.

3)
+     * Protect subskip_xid of pg_subscription from being concurrently updated
+     * while clearing it.

"subskip_xid" should be "subskipxid" I think.
 
4)
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by skip_xid option.
+ */

The option name was "skip_xid" in the previous version, and it is "xid" in
latest patch. So should we modify "skip_xid option" to "skip xid option", or
"skip option xid", or something else?

Also the following place has similar issue:
+ * the subscription if hte user has specified skip_xid. Once we start skipping

Regards,
Tang
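
Editorial sketch for comment 1) above: the quoted psql hunk adds columns to the query only when the server is v15 or newer. The following standalone C function illustrates that version gate under simplifying assumptions. It uses snprintf instead of psql's PQExpBuffer, and the function name build_subscription_query and the reduced column list are hypothetical, not taken from the patch.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the query construction in psql's subscription
 * listing. Writes the query into buf and returns its length, or -1 if the
 * buffer is too small.
 */
int
build_subscription_query(int server_version, char *buf, size_t buflen)
{
    size_t used = 0;
    int    n;

    n = snprintf(buf, buflen, "SELECT subname AS \"Name\"");
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    used += (size_t) n;

    /* Two_phase and skip XID are only supported in v15 and higher */
    if (server_version >= 150000)
    {
        n = snprintf(buf + used, buflen - used,
                     ", subtwophasestate AS \"Two phase commit\""
                     ", subskipxid AS \"Skip XID\"");
        if (n < 0 || (size_t) n >= buflen - used)
            return -1;
        used += (size_t) n;
    }

    n = snprintf(buf + used, buflen - used,
                 " FROM pg_catalog.pg_subscription");
    if (n < 0 || (size_t) n >= buflen - used)
        return -1;
    used += (size_t) n;

    return (int) used;
}
```

Called with 140000 the query omits both columns; called with 150000 it includes them, which is why the comment above the check should name both features.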

RE: Skipping logical replication transactions on subscriber side

From

"osumi.takamichi@fujitsu.com"

Date:

January 18, 2022, 06:20:15

On Monday, January 17, 2022 9:52 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> Thank you for the comments!
..
> > (2) Minor improvement suggestion of comment in
> > src/backend/replication/logical/worker.c
> >
> > + * reset during that.  Also, we don't skip receiving the changes in
> > + streaming
> > + * cases, since we decide whether or not to skip applying the changes
> > + when
> >
> > I sugguest that you don't use 'streaming cases', because what
> > "streaming cases" means sounds a bit broader than actual your
> implementation.
> > We do skip transaction of streaming cases but not during the spooling phase,
> right ?
> >
> > I suggest below.
> >
> > "We don't skip receiving the changes at the phase to spool streaming
> transactions"
> 
> I might be missing your point but I think it's correct that we don't skip receiving
> the change of the transaction that is sent via streaming protocol. And it doesn't
> sound broader to me. Could you elaborate on that?
OK. Sorry for the lack of explanation.

I felt that "streaming cases" implies a contrast with "non-streaming cases"
(in my head) when it is first used to explain something.
That contrast is what I imagined when I saw it.

Thus, I thought "streaming cases" meant
whole flow of streaming transactions which consists of messages
surrounded by stream start and stream stop and which are finished by
stream commit/stream abort (including 2PC variations).

When I come back to the subject, you wrote below in the comment

"we don't skip receiving the changes in streaming cases,
since we decide whether or not to skip applying the changes
when starting to apply changes"

The first part of this sentence
("we don't skip receiving the changes in streaming cases")
gives me the impression that we don't skip changes in the streaming cases
(as I understood them above), but the last part
("we decide whether or not to skip applying the changes
when starting to apply change") means we skip streamed transactions at the apply phase.

So this sentence looked slightly confusing to me.
Thus, I suggested the wording below (shown here joined with the existing part):

"we don't skip receiving the changes at the phase to spool streaming transactions
since we decide whether or not to skip applying the changes when starting to apply changes"

For me this looked better, but of course, this is a suggestion.

> >
> > (3) in the comment of apply_handle_prepare_internal, two full-width
> characters.
> >
> > 3-1
> > +        * won’t be resent in a case where the server crashes between
> them.
> >
> > 3-2
> > +        * COMMIT PREPARED or ROLLBACK PREPARED. But that’s okay
> > + because this
> >
> > You have full-width characters for "won't" and "that's".
> > Could you please check ?
> 
> Which characters in "won't" are full-width characters? I could not find them.
All characters I found and mentioned as full-width are single quotes.

It might be good to check the entire patch once
with a tool that can detect such characters.

> > (5)
> >
> > I can miss something here but, in one of the past discussions, there
> > seems a consensus that if the user specifies XID of a subtransaction,
> > it would be better to skip only the subtransaction.
> >
> > This time, is it out of the range of the patch ?
> > If so, I suggest you include some description about it either in the
> > commit message or around codes related to it.
> 
> How can the user know subtransaction XID? I suppose you refer to streaming
> protocol cases but while applying spooled changes we don't report
> subtransaction XID neither in server log nor pg_stat_subscription_workers.
Yeah, usually subtransaction XID is not exposed to the users. I agree.

But clarifying that this feature targets only top-level transactions
sounds better to me. Thank you, Amit-san, for your suggestion
in [1] about how we should write it!

> > (6)
> >
> > I feel it's a better idea to include a test whether to skip aborted
> > streaming transaction clears the XID in the TAP test for this feature,
> > in a sense to cover various new code paths. Did you have any special
> > reason to omit the case ?
> 
> Which code path is newly covered by this aborted streaming transaction tests?
> I think that this patch is already covered even by the test for a
> committed-and-streamed transaction. It doesn't matter whether the streamed
> transaction is committed or aborted because an error occurs while applying the
> spooled changes.
Oh, this was my mistake.  What I described as a new code path is
apply_handle_stream_abort -> clear_subscription_skip_xid.
But this was totally wrong, as you explained.


> >
> > (7)
> >
> > I want more explanation for the reason to restart the subscriber in
> > the TAP test because this is not mandatory operation.
> > (We can pass the TAP tests without this restart)
> >
> > From :
> > # Restart the subscriber node to restart logical replication with no
> > interval
> >
> > IIUC, below would be better.
> >
> > To :
> > # As an optimization to finish tests earlier, restart the subscriber
> > with no interval, # rather than waiting for new error to laucher a new apply
> worker.
> 
> I could not understand why the proposed sentence has more information.
> Does it mean you want to mention "As an optimization to finish tests earlier"?
Yes, exactly. The point is to add "As an optimization to finish tests earlier".

Probably I should have simply asked "why do you restart the subscriber?"
At first sight, I couldn't understand the purpose of the restart, and
the comment doesn't explain the reason itself.

[1] - https://www.postgresql.org/message-id/CAA4eK1JHUF7fVNHQ1ZRRgVsdE8XDY8BruU9dNP3Q3jizNdpEbg%40mail.gmail.com


Best Regards,
    Takamichi Osumi
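
Editorial sketch for comment (3) above: the apostrophes in the quoted patch lines are U+2019 (right single quotation mark), which UTF-8 encodes as the three bytes E2 80 99, so any byte at or above 0x80 in a source comment flags them. The helper below is a minimal example of the kind of check suggested, not a tool used in this thread. The name first_non_ascii_line is hypothetical.

```c
#include <stddef.h>

/*
 * Return the 1-based number of the first line that contains a byte
 * outside 7-bit ASCII, or 0 if the whole text is plain ASCII.
 */
size_t
first_non_ascii_line(const char *text)
{
    size_t               line = 1;
    const unsigned char *p;

    for (p = (const unsigned char *) text; *p != '\0'; p++)
    {
        if (*p >= 0x80)
            return line;        /* multi-byte UTF-8 sequence starts here */
        if (*p == '\n')
            line++;
    }
    return 0;
}
```

Running this over each added line of a patch would point at both "won’t" and "that’s" in the comment of apply_handle_prepare_internal.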

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 18, 2022, 06:37:44

On Tue, Jan 18, 2022 at 8:34 AM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Mon, Jan 17, 2022 2:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
>
> 2) The following two places are not consistent in whether "= value" is surround
> with square brackets.
>
> +ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable
class="parameter">skip_option</replaceable>[= <replaceable class="parameter">value</replaceable>] [, ... ] )
 
>
> +    <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable
class="parameter">value</replaceable>[, ... ] )</literal></term>
 
>
> Should we modify the first place to:
> +ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable
class="parameter">skip_option</replaceable>= <replaceable class="parameter">value</replaceable> [, ... ] )
 
>
> Because currently there is only one skip_option - xid, and a parameter must be
> specified when using it.
>

Good observation. Do we really need [, ... ] given that we currently
support only one value for XID?

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 18, 2022, 06:50:46

On Tue, Jan 18, 2022 at 12:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jan 18, 2022 at 8:34 AM tanghy.fnst@fujitsu.com
> <tanghy.fnst@fujitsu.com> wrote:
> >
> > On Mon, Jan 17, 2022 2:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> >
> > 2) The following two places are not consistent in whether "= value" is surround
> > with square brackets.
> >
> > +ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable
class="parameter">skip_option</replaceable>[= <replaceable class="parameter">value</replaceable>] [, ... ] )
 
> >
> > +    <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable
class="parameter">value</replaceable>[, ... ] )</literal></term>
 
> >
> > Should we modify the first place to:
> > +ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable
class="parameter">skip_option</replaceable>= <replaceable class="parameter">value</replaceable> [, ... ] )
 
> >
> > Because currently there is only one skip_option - xid, and a parameter must be
> > specified when using it.
> >
>
> Good observation. Do we really need [, ... ] as currently, we support
> only one value for XID?

I don't think so. In the doc, it should be:

ALTER SUBSCRIPTION name SKIP ( skip_option = value )

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 18, 2022, 07:39:01

On Tue, Jan 18, 2022 at 12:04 PM tanghy.fnst@fujitsu.com
<tanghy.fnst@fujitsu.com> wrote:
>
> On Mon, Jan 17, 2022 2:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached an updated patch. Please review it.
> >
>
> Thanks for updating the patch. Few comments:
>
> 1)
>                 /* Two_phase is only supported in v15 and higher */
>                 if (pset.sversion >= 150000)
>                         appendPQExpBuffer(&buf,
> -                                                         ", subtwophasestate AS \"%s\"\n",
> -                                                         gettext_noop("Two phase commit"));
> +                                                         ", subtwophasestate AS \"%s\"\n"
> +                                                         ", subskipxid AS \"%s\"\n",
> +                                                         gettext_noop("Two phase commit"),
> +                                                         gettext_noop("Skip XID"));
>
>                 appendPQExpBuffer(&buf,
>                                                   ",  subsynccommit AS \"%s\"\n"
>
> I think "skip xid" should be mentioned in the comment. Maybe it could be changed to:
> "Two_phase and skip XID are only supported in v15 and higher"

Added.

>
> 2) The following two places are not consistent in whether "= value" is surround
> with square brackets.
>
> +ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable
class="parameter">skip_option</replaceable>[= <replaceable class="parameter">value</replaceable>] [, ... ] )
 
>
> +    <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable
class="parameter">value</replaceable>[, ... ] )</literal></term>
 
>
> Should we modify the first place to:
> +ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable
class="parameter">skip_option</replaceable>= <replaceable class="parameter">value</replaceable> [, ... ] )
 
>
> Because currently there is only one skip_option - xid, and a parameter must be
> specified when using it.

Good catch. Fixed.

>
> 3)
> +        * Protect subskip_xid of pg_subscription from being concurrently updated
> +        * while clearing it.
>
> "subskip_xid" should be "subskipxid" I think.

Fixed.

>
> 4)
> +/*
> + * Start skipping changes of the transaction if the given XID matches the
> + * transaction ID specified by skip_xid option.
> + */
>
> The option name was "skip_xid" in the previous version, and it is "xid" in
> latest patch. So should we modify "skip_xid option" to "skip xid option", or
> "skip option xid", or something else?
>
> Also the following place has similar issue:
> + * the subscription if hte user has specified skip_xid. Once we start skipping

Fixed.

I've attached an updated patch. All comments I got so far were
incorporated into this patch unless I'm missing something.


Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

v7-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patch

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 18, 2022, 07:43:07

On Tue, Jan 18, 2022 at 12:20 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Monday, January 17, 2022 9:52 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > Thank you for the comments!
> ..
> > > (2) Minor improvement suggestion of comment in
> > > src/backend/replication/logical/worker.c
> > >
> > > + * reset during that.  Also, we don't skip receiving the changes in
> > > + streaming
> > > + * cases, since we decide whether or not to skip applying the changes
> > > + when
> > >
> > > I sugguest that you don't use 'streaming cases', because what
> > > "streaming cases" means sounds a bit broader than actual your
> > implementation.
> > > We do skip transaction of streaming cases but not during the spooling phase,
> > right ?
> > >
> > > I suggest below.
> > >
> > > "We don't skip receiving the changes at the phase to spool streaming
> > transactions"
> >
> > I might be missing your point but I think it's correct that we don't skip receiving
> > the change of the transaction that is sent via streaming protocol. And it doesn't
> > sound broader to me. Could you elaborate on that?
> OK. Excuse me for lack of explanation.
>
> I felt "streaming cases" implies "non-streaming cases"
> to compare a diffference (in my head) when it is
> used to explain something at first.
> I imagined the contrast between those, when I saw it.
>
> Thus, I thought "streaming cases" meant
> whole flow of streaming transactions which consists of messages
> surrounded by stream start and stream stop and which are finished by
> stream commit/stream abort (including 2PC variations).
>
> When I come back to the subject, you wrote below in the comment
>
> "we don't skip receiving the changes in streaming cases,
> since we decide whether or not to skip applying the changes
> when starting to apply changes"
>
> The first part of this sentence
> ("we don't skip receiving the changes in streaming cases")
> gives me an impression where we don't skip changes in the streaming cases
> (of my understanding above), but the last part
> ("we decide whether or not to skip applying the changes
> when starting to apply change") means we skip transactions for streaming at apply phase.
>
> So, this sentence looked confusing to me slightly.
> Thus, I suggested below (and when I connect it with existing part)
>
> "we don't skip receiving the changes at the phase to spool streaming transactions
> since we decide whether or not to skip applying the changes when starting to apply changes"
>
> For me this looked better, but of course, this is a suggestion.

Thank you for your explanation.

I've modified the comment with some changes since "the phase to spool
streaming transaction" does not seem to be commonly used in worker.c.

>
> > >
> > > (3) in the comment of apply_handle_prepare_internal, two full-width
> > characters.
> > >
> > > 3-1
> > > +        * won’t be resent in a case where the server crashes between
> > them.
> > >
> > > 3-2
> > > +        * COMMIT PREPARED or ROLLBACK PREPARED. But that’s okay
> > > + because this
> > >
> > > You have full-width characters for "won't" and "that's".
> > > Could you please check ?
> >
> > Which characters in "won't" are full-width characters? I could not find them.
> All characters I found and mentioned as full-width are single quotes.
>
> It might be good that you check the entire patch once
> by some tool that helps you to detect it.

Thanks!

>
> > > (5)
> > >
> > > I can miss something here but, in one of the past discussions, there
> > > seems a consensus that if the user specifies XID of a subtransaction,
> > > it would be better to skip only the subtransaction.
> > >
> > > This time, is it out of the range of the patch ?
> > > If so, I suggest you include some description about it either in the
> > > commit message or around codes related to it.
> >
> > How can the user know subtransaction XID? I suppose you refer to streaming
> > protocol cases but while applying spooled changes we don't report
> > subtransaction XID neither in server log nor pg_stat_subscription_workers.
> Yeah, usually subtransaction XID is not exposed to the users. I agree.
>
> But, clarifying the target of this feature is only top-level transactions
> sounds better to me. Thank you Amit-san for your support
> about how we should write it in [1] !

Yes, I've included the sentence proposed by Amit in the latest patch.

>
> > > (6)
> > >
> > > I feel it's a better idea to include a test whether to skip aborted
> > > streaming transaction clears the XID in the TAP test for this feature,
> > > in a sense to cover various new code paths. Did you have any special
> > > reason to omit the case ?
> >
> > Which code path is newly covered by this aborted streaming transaction tests?
> > I think that this patch is already covered even by the test for a
> > committed-and-streamed transaction. It doesn't matter whether the streamed
> > transaction is committed or aborted because an error occurs while applying the
> > spooled changes.
> Oh, this was my mistake.  What I expressed as a new patch is
> apply_handle_stream_abort -> clear_subscription_skip_xid.
> But, this was totally wrong as you explained.
>
>
> > >
> > > (7)
> > >
> > > I want more explanation for the reason to restart the subscriber in
> > > the TAP test because this is not mandatory operation.
> > > (We can pass the TAP tests without this restart)
> > >
> > > From :
> > > # Restart the subscriber node to restart logical replication with no
> > > interval
> > >
> > > IIUC, below would be better.
> > >
> > > To :
> > > # As an optimization to finish tests earlier, restart the subscriber
> > > with no interval, # rather than waiting for new error to laucher a new apply
> > worker.
> >
> > I could not understand why the proposed sentence has more information.
> > Does it mean you want to mention "As an optimization to finish tests earlier"?
> Yes, exactly. The point is to add "As an optimization to finish tests earlier".
>
> Probably, I should have asked a simple question "why do you restart the subscriber" ?
> At first sight, I couldn't understand the meaning for the restart and
> you don't explain the reason itself.

I thought "to restart logical replication with no interval" explains
the reason why we restart the subscriber. I've left this part as is, but
we can change it later if others also want that change.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

RE: Skipping logical replication transactions on subscriber side

From

"osumi.takamichi@fujitsu.com"

Date:

January 18, 2022, 08:37:35

On Tuesday, January 18, 2022 1:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> I've attached an updated patch. All comments I got so far were incorporated
> into this patch unless I'm missing something.

Hi, thank you for your new patch v7.
For your information, I've encountered a failure to apply patch v7
on top of the latest commit (d3f4532)

$ git am v7-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patch
Applying: Add ALTER SUBSCRIPTION ... SKIP to skip the transaction on subscriber nodes
error: patch failed: src/backend/parser/gram.y:9954
error: src/backend/parser/gram.y: patch does not apply

Could you please rebase it when it's necessary ?

Best Regards,
    Takamichi Osumi

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 18, 2022, 09:05:10

On Tue, Jan 18, 2022 at 2:37 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Tuesday, January 18, 2022 1:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > I've attached an updated patch. All comments I got so far were incorporated
> > into this patch unless I'm missing something.
>
> Hi, thank you for your new patch v7.
> For your information, I've encountered a failure to apply patch v7
> on top of the latest commit (d3f4532)
>
> $ git am v7-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patch
> Applying: Add ALTER SUBSCRIPTION ... SKIP to skip the transaction on subscriber nodes
> error: patch failed: src/backend/parser/gram.y:9954
> error: src/backend/parser/gram.y: patch does not apply
>
> Could you please rebase it when it's necessary ?

Thank you for reporting!

I've attached a rebased patch.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

v8-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patch

RE: Skipping logical replication transactions on subscriber side

From

"osumi.takamichi@fujitsu.com"

Date:

January 19, 2022, 06:22:08

On Tuesday, January 18, 2022 3:05 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> I've attached a rebased patch.
Thank you for your rebase !

Several review comments on v8.

(1) doc/src/sgml/logical-replication.sgml

+
+  <para>
+   To resolve conflicts, you need to consider changing the data on the subscriber so
+   that it doesn't conflict with incoming changes, or dropping the conflicting constraint
+   or unique index, or writing a trigger on the subscriber to suppress or redirect
+   conflicting incoming changes, or as a last resort, by skipping the whole transaction.
+   Skipping the whole transaction includes skipping changes that may not violate
+   any constraint.  This can easily make the subscriber inconsistent, especially if
+   a user specifies the wrong transaction ID or the position of origin.
+  </para>

The first sentence is too long and slightly hard to read.
One way to organize the listed items is to use "itemizedlist".
For instance, I imagined something like the following.

  <para>
    To resolve conflicts, you need to consider following actions:
    <itemizedlist>
      <listitem>
        <para>
          Change the data on the subscriber so that it doesn't conflict with incoming changes
        </para>
      </listitem>
      ...
      <listitem>
        <para>
          As a last resort, skip the whole transaction
        </para>
      </listitem>
    </itemizedlist>
    ....
  </para>

What do you think?

By the way, in case you want to keep the current sentence style,
I have one more question. Do we need "by" in the part
"by skipping the whole transaction"? If we focus on only this action,
I think the sentence becomes "you need to consider skipping the whole transaction".
If so, we don't need "by" in that part.

(2)

Also, in the same paragraph, we write

+ ... This can easily make the subscriber inconsistent, especially if
+   a user specifies the wrong transaction ID or the position of origin.

Should the subject of this sentence be "Those" or "Some of those"?
I ask because we want to refer to either the new skip xid feature or
"pg_replication_origin_advance".

(3) doc/src/sgml/ref/alter_subscription.sgml

The change below contains unnecessary spaces.
+      the whole transaction.  Using <command> ALTER SUBSCRIPTION ... SKIP </command>

Need to change
From:
<command> ALTER SUBSCRIPTION ... SKIP </command>
To:
<command>ALTER SUBSCRIPTION ... SKIP</command>

(4) comment in clear_subscription_skip_xid

+        * the flush position the transaction will be sent again and the user
+        * needs to be set subskipxid again.  We can reduce the possibility by

Should change
From:
the user needs to be set...
To:
the user needs to set...

(5) clear_subscription_skip_xid

+       if (!HeapTupleIsValid(tup))
+               elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);

Can we change it to ereport with ERRCODE_UNDEFINED_OBJECT?
This suggestion has another benefit: within one patch, we would not
mix ereport and elog.

(6) apply_handle_stream_abort

@@ -1209,6 +1300,13 @@ apply_handle_stream_abort(StringInfo s)

        logicalrep_read_stream_abort(s, &xid, &subxid);

+       /*
+        * We don't expect the user to set the XID of the transaction that is
+        * rolled back but if the skip XID is set, clear it.
+        */
+       if (MySubscription->skipxid == xid || MySubscription->skipxid == subxid)
+               clear_subscription_skip_xid(MySubscription->skipxid, InvalidXLogRecPtr, 0);
+

In my humble opinion, this still takes the subtransaction XID into account.
If we want to be consistent in handling top-level transactions only,
I think checking MySubscription->skipxid == xid should be sufficient.

Below is an *insane* (in the sense that it is not correct usage) scenario
that hits "MySubscription->skipxid == subxid".
Sorry if it is not perfect.

-------
Set logical_decoding_work_mem = 64.
Create tables named 'tab' with a column id (integer);
Create pub and sub with streaming = true.
No initial data is required on both nodes
because we just want to issue stream_abort
after executing skip xid feature.

<Session1> to the publisher
begin;
select pg_current_xact_id(); -- for reference
insert into tab values (1);
savepoint s1;
insert into tab values (2);
savepoint s2;
insert into tab values (generate_series(1001, 2000));
select ctid, xmin, xmax, id from tab where id in (1, 2, 1001);

<Session2> to the subscriber
select subname, subskipxid from pg_subscription; -- shows 0
alter subscription mysub skip (xid = xxx); -- xxx is that of xmin for 1001 on the publisher
select subname, subskipxid from pg_subscription; -- check it shows xxx just in case

<Session1>
rollback to s1;
commit;
select * from tab; -- shows only data '1'.

<Session2>
select subname, subskipxid from pg_subscription; -- shows 0. subskipxid was reset by the skip xid feature
select count(1) = 1 from tab; -- shows true

FYI: the results of those last two commands:
postgres=# select subname, subskipxid from pg_subscription;
 subname | subskipxid 
---------+------------
 mysub   |          0
(1 row)

postgres=# select count(1) = 1 from tab;
 ?column? 
----------
 t
(1 row)

Thus, it still takes subtransactions into account and clears the subskipxid.
Should we fix this behavior for consistency?

Best Regards,
    Takamichi Osumi
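
Editorial sketch for comment (6) above: the two candidate conditions for clearing the skip XID on a stream abort can be compared side by side. The first function mirrors the condition in the quoted v8 hunk, and the second is the top-level-only check suggested in the comment. This is a standalone illustration. The TransactionId typedef is a stand-in for the PostgreSQL type, both function names are hypothetical, and the XID values in the usage note are invented.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;     /* stand-in for the PostgreSQL typedef */

/* Condition as quoted from v8: top-level XID or subtransaction XID matches. */
bool
abort_clears_skipxid_v8(TransactionId skipxid,
                        TransactionId xid, TransactionId subxid)
{
    return skipxid == xid || skipxid == subxid;
}

/* Suggested alternative: only the top-level XID is considered. */
bool
abort_clears_skipxid_toplevel_only(TransactionId skipxid, TransactionId xid)
{
    return skipxid == xid;
}
```

With a top-level XID of 740 and an aborted subtransaction XID of 742, setting the skip XID to 742 makes the v8 condition true (subskipxid is cleared, as in the scenario above), while the top-level-only condition stays false.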

Re: Skipping logical replication transactions on subscriber side

From

vignesh C

Date:

January 19, 2022, 09:32:19

On Sat, Jan 15, 2022 at 3:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jan 14, 2022 at 5:35 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Thanks for the updated patch, few minor comments:
> > 1) Should "SKIP" be "SKIP (" here:
> > @@ -1675,7 +1675,7 @@ psql_completion(const char *text, int start, int end)
> >         /* ALTER SUBSCRIPTION <name> */
> >         else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
> >                 COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
> > -                                         "RENAME TO", "REFRESH
> > PUBLICATION", "SET",
> > +                                         "RENAME TO", "REFRESH
> > PUBLICATION", "SET", "SKIP",
> >
>
> Won't the another rule as follows added by patch sufficient for what
> you are asking?
> + /* ALTER SUBSCRIPTION <name> SKIP */
> + else if (Matches("ALTER", "SUBSCRIPTION", MatchAny, "SKIP"))
> + COMPLETE_WITH("(");
>
> I might be missing something but why do you think the handling of SKIP
> be any different than what we are doing for SET?

In the case of "ALTER SUBSCRIPTION sub1 SET" there are two possible tab
completion options: the user can specify either "ALTER SUBSCRIPTION sub1
SET PUBLICATION pub1" or "ALTER SUBSCRIPTION sub1 SET ( option = value )"
with an option such as STREAMING = 'on'. That is why we offer two options,
as below:
postgres=# ALTER SUBSCRIPTION sub1 SET
(            PUBLICATION

Whereas in the case of SKIP there is only one possible tab completion
option, i.e. XID. We handle the WITH option similarly: we offer
"WITH (" in tab completion for "CREATE PUBLICATION pub1":
postgres=# CREATE PUBLICATION pub1
FOR ALL TABLES            FOR ALL TABLES IN SCHEMA  FOR TABLE
       WITH (

Regards,
Vignesh

Re: Skipping logical replication transactions on subscriber side

From

Greg Nancarrow

Date:

January 19, 2022, 10:14:28

On Tue, Jan 18, 2022 at 5:05 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached a rebased patch.

A couple of comments for the v8 patch:

doc/src/sgml/logical-replication.sgml

(1)
Strictly speaking, it's the transaction, not the transaction ID, that
contains changes, so I suggest a minor change:

BEFORE:
+   The transaction ID that contains the change violating the constraint can be
AFTER:
+   The ID of the transaction that contains the change violating the
constraint can be


doc/src/sgml/ref/alter_subscription.sgml

(2) apply_handle_commit_internal
It's not entirely apparent what commits the clearing of subskipxid
here, so I suggest the following addition:

BEFORE:
+ * clear subskipxid of pg_subscription.
AFTER:
+ * clear subskipxid of pg_subscription, then commit.


Regards,
Greg Nancarrow
Fujitsu Australia

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 19, 2022, 10:15:56

On Wed, Jan 19, 2022 at 12:22 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Tuesday, January 18, 2022 3:05 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > I've attached a rebased patch.
> Thank you for the rebase!
>
> Several review comments on v8.

Thank you for the comments!

>
> (1) doc/src/sgml/logical-replication.sgml
>
> +
> +  <para>
> +   To resolve conflicts, you need to consider changing the data on the subscriber so
> +   that it doesn't conflict with incoming changes, or dropping the conflicting constraint
> +   or unique index, or writing a trigger on the subscriber to suppress or redirect
> +   conflicting incoming changes, or as a last resort, by skipping the whole transaction.
> +   Skipping the whole transaction includes skipping changes that may not violate
> +   any constraint.  This can easily make the subscriber inconsistent, especially if
> +   a user specifies the wrong transaction ID or the position of origin.
> +  </para>
>
> The first sentence is too long and slightly hard to read.
> One way to organize the listed items is to use "itemizedlist".
> For instance, I imagined something like the following.
>
>   <para>
>     To resolve conflicts, you need to consider following actions:
>     <itemizedlist>
>       <listitem>
>         <para>
>           Change the data on the subscriber so that it doesn't conflict with incoming changes
>         </para>
>       </listitem>
>       ...
>       <listitem>
>         <para>
>           As a last resort, skip the whole transaction
>         </para>
>       </listitem>
>     </itemizedlist>
>     ....
>   </para>
>
> What do you think?
>
> By the way, if you want to keep the current sentence style,
> I have one more question. Do we need "by" in the part
> "by skipping the whole transaction"? If we focus only on this action,
> I think the sentence becomes "you need to consider skipping the whole transaction".
> If so, we don't need "by" there.

I personally prefer to keep the current sentence, since a list does
not seem suitable in this case. But I agree that "by" is not
necessary here.

>
> (2)
>
> Also, in the same paragraph, we write
>
> + ... This can easily make the subscriber inconsistent, especially if
> +   a user specifies the wrong transaction ID or the position of origin.
>
> Should the subject of this sentence be "Those" or "Some of those"?
> I ask because we want to refer to either the new skip xid feature or
> pg_replication_origin_advance().

I think "This" in the sentence refers to "Skipping the whole
transaction". In the previous paragraph, we describe that there are
two methods for skipping the whole transaction: this new feature and
pg_replication_origin_advance(). And in this paragraph, we don't
mention any specific methods for skipping the whole transaction but
describe that skipping the whole transaction per se can easily make
the subscriber inconsistent. The current structure is fine with me.

>
> (3) doc/src/sgml/ref/alter_subscription.sgml
>
> The change below contains unnecessary spaces.
> +      the whole transaction.  Using <command> ALTER SUBSCRIPTION ... SKIP </command>
>
> Need to change
> From:
> <command> ALTER SUBSCRIPTION ... SKIP </command>
> To:
> <command>ALTER SUBSCRIPTION ... SKIP</command>

Will remove.

>
> (4) comment in clear_subscription_skip_xid
>
> +        * the flush position the transaction will be sent again and the user
> +        * needs to be set subskipxid again.  We can reduce the possibility by
>
> Should change
> From:
> the user needs to be set...
> To:
> the user needs to set...

Will fix.

>
> (5) clear_subscription_skip_xid
>
> +       if (!HeapTupleIsValid(tup))
> +               elog(ERROR, "subscription \"%s\" does not exist", MySubscription->name);
>
> Can we change it to ereport with ERRCODE_UNDEFINED_OBJECT ?
> This suggestion has another benefit: within one patch, we would not mix
> ereport and elog.

I don’t think we need to set errcode since this error is a
should-not-happen error.

>
> (6) apply_handle_stream_abort
>
> @@ -1209,6 +1300,13 @@ apply_handle_stream_abort(StringInfo s)
>
>         logicalrep_read_stream_abort(s, &xid, &subxid);
>
> +       /*
> +        * We don't expect the user to set the XID of the transaction that is
> +        * rolled back but if the skip XID is set, clear it.
> +        */
> +       if (MySubscription->skipxid == xid || MySubscription->skipxid == subxid)
> +               clear_subscription_skip_xid(MySubscription->skipxid, InvalidXLogRecPtr, 0);
> +
>
> In my humble opinion, this still cares about the subtransaction XID.
> If we want to consistently handle top-level transactions only,
> I feel that checking MySubscription->skipxid == xid should be sufficient.

My thinking was that if we can clear, at a reasonable cost, a subskipxid
whose value has already been processed on the subscriber, it makes sense
to do so, because it reduces the chance that the XID wraps around while a
wrong value is still left in subskipxid. But as you pointed out, the
current behavior doesn't match the description in the doc:

After the logical replication successfully skips the transaction, the
transaction ID (stored in pg_subscription.subskipxid) is cleared.

and

We don't support skipping individual subtransactions.

I'll remove it in the next version of the patch.
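
For illustration, the top-level-only rule discussed in (6) can be sketched in standalone C. This is not the patch's code: TransactionId is a local stand-in for PostgreSQL's type, and should_clear_skip_xid() is a hypothetical helper that models only the comparison, not the catalog update.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;     /* stand-in for PostgreSQL's type */
#define InvalidTransactionId ((TransactionId) 0)

/*
 * Decide whether a stream abort for (xid, subxid) should clear the
 * configured skip XID.  Only the top-level xid is compared; subxid is
 * deliberately ignored because skipping individual subtransactions is
 * not supported.
 */
bool
should_clear_skip_xid(TransactionId skipxid, TransactionId xid,
                      TransactionId subxid)
{
    /* The top-level XID of a streamed transaction is always valid. */
    assert(xid != InvalidTransactionId);

    (void) subxid;

    return skipxid != InvalidTransactionId && skipxid == xid;
}
```

Under this rule a skip XID that happens to equal a subtransaction's XID is left untouched, which matches the documented statement that individual subtransactions cannot be skipped.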

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 19, 2022, 11:57:51

On Wed, Jan 19, 2022 at 12:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jan 19, 2022 at 12:22 PM osumi.takamichi@fujitsu.com
> <osumi.takamichi@fujitsu.com> wrote:
> >
> > (6) apply_handle_stream_abort
> >
> > @@ -1209,6 +1300,13 @@ apply_handle_stream_abort(StringInfo s)
> >
> >         logicalrep_read_stream_abort(s, &xid, &subxid);
> >
> > +       /*
> > +        * We don't expect the user to set the XID of the transaction that is
> > +        * rolled back but if the skip XID is set, clear it.
> > +        */
> > +       if (MySubscription->skipxid == xid || MySubscription->skipxid == subxid)
> > +               clear_subscription_skip_xid(MySubscription->skipxid, InvalidXLogRecPtr, 0);
> > +
> >
> > In my humble opinion, this still cares about the subtransaction XID.
> > If we want to consistently handle top-level transactions only,
> > I feel that checking MySubscription->skipxid == xid should be sufficient.
>
> My thinking was that if we can clear, at a reasonable cost, a subskipxid
> whose value has already been processed on the subscriber, it makes sense
> to do so, because it reduces the chance that the XID wraps around while a
> wrong value is still left in subskipxid.
>

I guess that could happen if the user sets some unrelated XID value.
So, I think it should be okay to not clear this but we can add a
comment in the code at that place that we don't clear subtransaction's
XID as we don't support skipping individual subtransactions or
something like that.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Peter Eisentraut

Date:

January 20, 2022, 19:18:51

On 18.01.22 07:05, Masahiko Sawada wrote:
> I've attached a rebased patch.

I think this is now almost done.  Attached I have a small fixup patch 
with some documentation proof-reading, and removing some comments I felt 
are redundant.  Some others have also sent you some documentation 
updates, so feel free to merge mine in with them.

Some other comments:

parse_subscription_options() and AlterSubscriptionStmt mix regular
options and skip options in ways that confuse me.  It seems to work 
correctly, though.  I guess for now it's okay, but if we add more skip 
options, it might be better to separate those more cleanly.

I think the superuser check in AlterSubscription() might no longer be 
appropriate.  Subscriptions can now be owned by non-superusers.  Please 
check that.

The display order in psql \dRs+ is a bit odd.  I would put it at the 
end, certainly not between Two phase commit and Synchronous commit.

Please run pgperltidy over 028_skip_xact.pl.

Is the setting of logical_decoding_work_mem in the test script required?
If so, comment why.

Please document arguments origin_lsn and origin_timestamp of
stop_skipping_changes().  Otherwise, one has to dig quite deep to find
out what they are for.

This is all minor stuff, so I think when this and the nearby comments 
are addressed, this is fine by me.

Attachments

0001-fixup-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-tran.patch

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 21, 2022, 06:08:28

On Fri, Jan 21, 2022 at 1:18 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 18.01.22 07:05, Masahiko Sawada wrote:
> > I've attached a rebased patch.
>
> I think this is now almost done.  Attached I have a small fixup patch
> with some documentation proof-reading, and removing some comments I felt
> are redundant.  Some others have also sent you some documentation
> updates, so feel free to merge mine in with them.

Thank you for reviewing the patch and attaching the fixup patch!

>
> Some other comments:
>
> parse_subscription_options() and AlterSubscriptionStmt mix regular
> options and skip options in ways that confuse me.  It seems to work
> correctly, though.  I guess for now it's okay, but if we add more skip
> options, it might be better to separate those more cleanly.

Agreed.

>
> I think the superuser check in AlterSubscription() might no longer be
> appropriate.  Subscriptions can now be owned by non-superusers.  Please
> check that.

IIUC we don't allow non-superuser to own the subscription yet. We
still have the following superuser checks:

In CreateSubscription():

    if (!superuser())
        ereport(ERROR,
                (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
                 errmsg("must be superuser to create subscriptions")));

and in AlterSubscriptionOwner_internal():

    /* New owner must be a superuser */
    if (!superuser_arg(newOwnerId))
        ereport(ERROR,
                (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
                 errmsg("permission denied to change owner of subscription \"%s\"",
                        NameStr(form->subname)),
                 errhint("The owner of a subscription must be a superuser.")));

Also, doing superuser check here seems to be consistent with
pg_replication_origin_advance() which is another way to skip
transactions and also requires superuser permission.

>
> The display order in psql \dRs+ is a bit odd.  I would put it at the
> end, certainly not between Two phase commit and Synchronous commit.

Fixed.

>
> Please run pgperltidy over 028_skip_xact.pl.

Fixed.

>
> Is the setting of logical_decoding_work_mem in the test script required?
>   If so, comment why.

Yes, it makes it easy for the tests to cover the streaming logical
replication cases. Added the comment.

>
> Please document arguments origin_lsn and origin_timestamp of
> stop_skipping_changes().  Otherwise, one has to dig quite deep to find
> out what they are for.

Added.

Also, after reading the documentation updates, I realized that there
are two paragraphs describing almost the same things, so I merged them.
Please check the doc updates in the latest patch.

I've attached an updated patch that incorporates these comments as
well as other comments I got so far.


Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

v9-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transac.patch

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 21, 2022, 06:11:05

On Wed, Jan 19, 2022 at 3:32 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Sat, Jan 15, 2022 at 3:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Jan 14, 2022 at 5:35 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > Thanks for the updated patch, few minor comments:
> > > 1) Should "SKIP" be "SKIP (" here:
> > > @@ -1675,7 +1675,7 @@ psql_completion(const char *text, int start, int end)
> > >         /* ALTER SUBSCRIPTION <name> */
> > >         else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
> > >                 COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
> > > -                                         "RENAME TO", "REFRESH
> > > PUBLICATION", "SET",
> > > +                                         "RENAME TO", "REFRESH
> > > PUBLICATION", "SET", "SKIP",
> > >
> >
> > Won't the other rule added by the patch, shown below, be sufficient for
> > what you are asking?
> > + /* ALTER SUBSCRIPTION <name> SKIP */
> > + else if (Matches("ALTER", "SUBSCRIPTION", MatchAny, "SKIP"))
> > + COMPLETE_WITH("(");
> >
> > I might be missing something but why do you think the handling of SKIP
> > should be any different than what we are doing for SET?
>
> In the case of "ALTER SUBSCRIPTION sub1 SET" there are two possible tab
> completion options: the user can specify either "ALTER SUBSCRIPTION sub1
> SET PUBLICATION pub1" or "ALTER SUBSCRIPTION sub1 SET ( <option such as
> streaming> = 'on' )". That is why we offer two options, as
> below:
> postgres=# ALTER SUBSCRIPTION sub1 SET
> (            PUBLICATION
>
> In the case of SKIP, by contrast, there is only one possible tab completion
> option, i.e., XID. We handle the WITH option similarly: tab completion for
> "CREATE PUBLICATION pub1" offers "WITH (":
> postgres=# CREATE PUBLICATION pub1
> FOR ALL TABLES            FOR ALL TABLES IN SCHEMA  FOR TABLE
>        WITH (

Right. I've incorporated this comment into the latest v9 patch[1].

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoDOuNtvFUfU2wH2QgTJ6AyMXXh_vdA87qX0mUibdsrYTg%40mail.gmail.com

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 21, 2022, 06:13:25

On Wed, Jan 19, 2022 at 4:14 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Tue, Jan 18, 2022 at 5:05 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached a rebased patch.
>
> A couple of comments for the v8 patch:

Thank you for the comments!

>
> doc/src/sgml/logical-replication.sgml
>
> (1)
> Strictly-speaking it's the transaction, not transaction ID, that
> contains changes, so suggesting minor change:
>
> BEFORE:
> +   The transaction ID that contains the change violating the constraint can be
> AFTER:
> +   The ID of the transaction that contains the change violating the
> constraint can be
>
>
> doc/src/sgml/ref/alter_subscription.sgml
>
> (2) apply_handle_commit_internal
> It's not entirely apparent what commits the clearing of subskipxid
> here, so I suggest the following addition:
>
> BEFORE:
> + * clear subskipxid of pg_subscription.
> AFTER:
> + * clear subskipxid of pg_subscription, then commit.
>

These comments are merged with Peter's comments and incorporated into
the latest v9 patch[1].

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoDOuNtvFUfU2wH2QgTJ6AyMXXh_vdA87qX0mUibdsrYTg%40mail.gmail.com

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 21, 2022, 06:14:05

On Wed, Jan 19, 2022 at 5:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jan 19, 2022 at 12:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Jan 19, 2022 at 12:22 PM osumi.takamichi@fujitsu.com
> > <osumi.takamichi@fujitsu.com> wrote:
> > >
> > > (6) apply_handle_stream_abort
> > >
> > > @@ -1209,6 +1300,13 @@ apply_handle_stream_abort(StringInfo s)
> > >
> > >         logicalrep_read_stream_abort(s, &xid, &subxid);
> > >
> > > +       /*
> > > +        * We don't expect the user to set the XID of the transaction that is
> > > +        * rolled back but if the skip XID is set, clear it.
> > > +        */
> > > +       if (MySubscription->skipxid == xid || MySubscription->skipxid == subxid)
> > > +               clear_subscription_skip_xid(MySubscription->skipxid, InvalidXLogRecPtr, 0);
> > > +
> > >
> > > In my humble opinion, this still cares about the subtransaction XID.
> > > If we want to consistently handle top-level transactions only,
> > > I feel that checking MySubscription->skipxid == xid should be sufficient.
> >
> > My thinking was that if we can clear, at a reasonable cost, a subskipxid
> > whose value has already been processed on the subscriber, it makes sense
> > to do so, because it reduces the chance that the XID wraps around while a
> > wrong value is still left in subskipxid.
> >
>
> I guess that could happen if the user sets some unrelated XID value.
> So, I think it should be okay to not clear this but we can add a
> comment in the code at that place that we don't clear subtransaction's
> XID as we don't support skipping individual subtransactions or
> something like that.

Agreed and added the comment in the latest patch[1].

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoDOuNtvFUfU2wH2QgTJ6AyMXXh_vdA87qX0mUibdsrYTg%40mail.gmail.com

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 21, 2022, 06:20:23

On Fri, Jan 21, 2022 at 8:39 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Jan 21, 2022 at 1:18 AM Peter Eisentraut
> <peter.eisentraut@enterprisedb.com> wrote:
> >
> > I think the superuser check in AlterSubscription() might no longer be
> > appropriate.  Subscriptions can now be owned by non-superusers.  Please
> > check that.
>
> IIUC we don't allow non-superuser to own the subscription yet. We
> still have the following superuser checks:
>
> In CreateSubscription():
>
>     if (!superuser())
>         ereport(ERROR,
>                 (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
>                  errmsg("must be superuser to create subscriptions")));
>
> and in AlterSubscriptionOwner_internal():
>
>     /* New owner must be a superuser */
>     if (!superuser_arg(newOwnerId))
>         ereport(ERROR,
>                 (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
>                  errmsg("permission denied to change owner of subscription \"%s\"",
>                         NameStr(form->subname)),
>                  errhint("The owner of a subscription must be a superuser.")));
>
> Also, doing superuser check here seems to be consistent with
> pg_replication_origin_advance() which is another way to skip
> transactions and also requires superuser permission.
>

+1. I think this feature has the potential to make data inconsistent
and should only be used as a last resort to resolve conflicts, so it is
better to allow it only for superusers.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 21, 2022, 07:20:41

On Tue, Jan 18, 2022 at 9:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Jan 18, 2022 at 12:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Jan 18, 2022 at 8:34 AM tanghy.fnst@fujitsu.com
> > <tanghy.fnst@fujitsu.com> wrote:
> > >
> > > On Mon, Jan 17, 2022 2:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > >
> > > 2) The following two places are not consistent in whether "= value" is surrounded
> > > by square brackets.
> > >
> > > +ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable>[= <replaceable class="parameter">value</replaceable>] [, ... ] )
> > >
> > > +    <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable>[, ... ] )</literal></term>
> > >
> > > Should we modify the first place to:
> > > +ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable>= <replaceable class="parameter">value</replaceable> [, ... ] )
> > >
> > > Because currently there is only one skip_option - xid, and a parameter must be
> > > specified when using it.
> > >
> >
> > Good observation. Do we really need [, ... ] as currently, we support
> > only one value for XID?
>
> I think no. In the doc, it should be:
>
> ALTER SUBSCRIPTION name SKIP ( skip_option = value )
>

In the latest patch, I see:
+   <varlistentry>
+    <term><literal>SKIP ( <replaceable
class="parameter">skip_option</replaceable> = <replaceable
class="parameter">value</replaceable> [, ... ] )</literal></term>

What do we want to indicate by [, ... ]? To me, it appears like
multiple options but that is not what we support currently.
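
As a rough illustration of "exactly one option, and its value is required", the check can be sketched in standalone C. This is not the patch's parse_subscription_options(): parse_skip_xid() is a hypothetical helper, and the restriction to normal XIDs (3 and above in PostgreSQL) is an assumption of this sketch rather than something stated in this thread.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* PostgreSQL reserves XIDs 0..2; 3 is the first normal XID. */
#define FIRST_NORMAL_XID 3u

/*
 * Parse the value of a single "xid = value" skip option.
 * Returns false if the value is missing, is not a plain decimal
 * number, does not fit in 32 bits, or is not a normal XID.
 */
bool
parse_skip_xid(const char *value, uint32_t *xid_out)
{
    char       *end;
    unsigned long parsed;

    assert(xid_out != NULL);

    /* A value is mandatory and must start with a digit (no sign, no space). */
    if (value == NULL || !isdigit((unsigned char) value[0]))
        return false;

    errno = 0;
    parsed = strtoul(value, &end, 10);
    if (errno != 0 || *end != '\0' || parsed > UINT32_MAX)
        return false;
    if (parsed < FIRST_NORMAL_XID)
        return false;

    *xid_out = (uint32_t) parsed;
    return true;
}
```

A value such as "590,591" is rejected here, which mirrors the point that the syntax should not advertise [, ... ] while only a single XID is supported.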

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 21, 2022, 07:40:17

On Fri, Jan 21, 2022 at 1:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jan 18, 2022 at 9:21 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Jan 18, 2022 at 12:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Jan 18, 2022 at 8:34 AM tanghy.fnst@fujitsu.com
> > > <tanghy.fnst@fujitsu.com> wrote:
> > > >
> > > > On Mon, Jan 17, 2022 2:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > >
> > > > 2) The following two places are not consistent in whether "= value" is surrounded
> > > > by square brackets.
> > > >
> > > > +ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable>[= <replaceable class="parameter">value</replaceable>] [, ... ] )
> > > >
> > > > +    <term><literal>SKIP ( <replaceable class="parameter">skip_option</replaceable> = <replaceable class="parameter">value</replaceable>[, ... ] )</literal></term>
> > > >
> > > > Should we modify the first place to:
> > > > +ALTER SUBSCRIPTION <replaceable class="parameter">name</replaceable> SKIP ( <replaceable class="parameter">skip_option</replaceable>= <replaceable class="parameter">value</replaceable> [, ... ] )
> > > >
> > > > Because currently there is only one skip_option - xid, and a parameter must be
> > > > specified when using it.
> > > >
> > >
> > > Good observation. Do we really need [, ... ] as currently, we support
> > > only one value for XID?
> >
> > I think no. In the doc, it should be:
> >
> > ALTER SUBSCRIPTION name SKIP ( skip_option = value )
> >
>
> In the latest patch, I see:
> +   <varlistentry>
> +    <term><literal>SKIP ( <replaceable
> class="parameter">skip_option</replaceable> = <replaceable
> class="parameter">value</replaceable> [, ... ] )</literal></term>
>
> What do we want to indicate by [, ... ]? To me, it appears like
> multiple options but that is not what we support currently.

You're right. It's an oversight.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

RE: Skipping logical replication transactions on subscriber side

From

"osumi.takamichi@fujitsu.com"

Date:

January 21, 2022, 08:02:45

On Friday, January 21, 2022 12:08 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> I've attached an updated patch that incorporated these comments as well as
> other comments I got so far.
Thank you for the update!

A few minor comments.

(1) trivial question

Going by the current doc description in v9, is it perfectly clear
to users that, in a cascading logical replication setup,
we can't selectively skip an arbitrary transaction from an upstream node
on one node without also skipping its application on all subsequent nodes?

IIUC, this is because we don't write the skipped changes to WAL either,
so we can't propagate them to subsequent nodes.

I tested this case and, as I expected, the changes were not propagated.
The same applies to the other ways of resolving conflicts, though.

(2) suggestion

There's no harm in adding a note for the committer,
"Bump catalog version", to the commit message,
since the patch changes the catalog.

(3) minor question

In the past, there was a discussion that
it might be better to reset the XID
when subconninfo is changed,
since that may mean connecting to a different
publisher with a different XID space.
Currently, we regard that as the user's responsibility.
Is this correct?

Best Regards,
    Takamichi Osumi

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 21, 2022, 08:29:45

On Fri, Jan 21, 2022 at 10:32 AM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Friday, January 21, 2022 12:08 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > I've attached an updated patch that incorporated these comments as well as
> > other comments I got so far.
> Thank you for the update!
>
> A few minor comments.
>
> (1) trivial question
>
> Going by the current doc description in v9, is it perfectly clear
> to users that, in a cascading logical replication setup,
> we can't selectively skip an arbitrary transaction from an upstream node
> on one node without also skipping its application on all subsequent nodes?
>
> IIUC, this is because we don't write the skipped changes to WAL either,
> so we can't propagate them to subsequent nodes.
>
> I tested this case and, as I expected, the changes were not propagated.
> The same applies to the other ways of resolving conflicts, though.
>

Right, there is nothing new here, as the user will get the same effect when
she uses the existing function pg_replication_origin_advance(). So, I'm not
sure we want to add something specific about this.

>
> (3) minor question
>
> In the past, there was a discussion that
> it might be better to reset the XID
> when subconninfo is changed,
> since that may mean connecting to a different
> publisher with a different XID space.
> Currently, we regard that as the user's responsibility.
> Is this correct?
>

If the user points to another publisher, doesn't she similarly
need to change slot_name as well? If so, I think this can be treated
in a similar way.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Greg Nancarrow

Date:

January 21, 2022, 08:50:40

On Fri, Jan 21, 2022 at 2:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached an updated patch that incorporated these comments as
> well as other comments I got so far.
>

src/backend/replication/logical/worker.c

(1)
Didn't you mean to say "check the" instead of "clear" in the following
comment? (the subtransaction's XID was never being cleared before,
just checked against the skipxid, and now that check has been removed)

+ * ...      .  Since we don't
+ * support skipping individual subtransactions we don't clear
+ * subtransaction's XID.

Other than that, the patch LGTM.

Regards,
Greg Nancarrow
Fujitsu Australia

RE: Skipping logical replication transactions on subscriber side

From

"osumi.takamichi@fujitsu.com"

Date:

January 21, 2022, 10:45:06

On Friday, January 21, 2022 2:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Jan 21, 2022 at 10:32 AM osumi.takamichi@fujitsu.com
> <osumi.takamichi@fujitsu.com> wrote:
> >
> > On Friday, January 21, 2022 12:08 PM Masahiko Sawada
> <sawada.mshk@gmail.com> wrote:
> > > I've attached an updated patch that incorporated these comments as
> > > well as other comments I got so far.
> > Thank you for the update!
> >
> > A few minor comments.
> >
> > (1) trivial question
> >
> > Going by the current doc description in v9, is it perfectly clear
> > to users that, in a cascading logical replication setup, we can't
> > selectively skip an arbitrary transaction from an upstream node on
> > one node without also skipping its application on all subsequent
> > nodes?
> >
> > IIUC, this is because we don't write the skipped changes to WAL either,
> > so we can't propagate them to subsequent nodes.
> >
> > I tested this case and, as I expected, the changes were not propagated.
> > The same applies to the other ways of resolving conflicts, though.
> >
> 
> Right, there is nothing new here, as the user will get the same effect when
> she uses the existing function pg_replication_origin_advance(). So, I'm not
> sure we want to add something specific about this.
Okay, thank you for clarifying this !
That's good to know.


> > (3) minor question
> >
> > In the past, there was a discussion that it might be better to
> > reset the XID when subconninfo is changed, since that may mean
> > connecting to a different publisher with a different XID space.
> > Currently, we regard that as the user's responsibility.
> > Is this correct?
> >
> 
> If the user points to another publisher, doesn't she similarly need to
> change slot_name as well? If so, I think this can be treated in a similar way.
I see. In AlterSubscription(), switching slot_name
doesn't affect other columns, so IIUC
we don't need any special measure for this case either.
Thanks!

Best Regards,
    Takamichi Osumi

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 21, 2022, 14:55:22

On Fri, Jan 21, 2022 at 10:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Jan 21, 2022 at 1:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > What do we want to indicate by [, ... ]? To me, it appears like
> > multiple options but that is not what we support currently.
>
> You're right. It's an oversight.
>

I have fixed this and a few other things in the attached patch.
1.
The newly added column needs to be added to the column list in the following statement:
-- All columns of pg_subscription except subconninfo are publicly readable.
REVOKE ALL ON pg_subscription FROM public;
GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
              substream, subtwophasestate, subslotname, subsynccommit,
              subpublications)
    ON pg_subscription TO public;

2.
+stop_skipping_changes(bool clear_subskipxid, XLogRecPtr origin_lsn,
+                      TimestampTz origin_timestamp)
+{
+    Assert(is_skipping_changes());
+
+    ereport(LOG,
+            (errmsg("done skipping logical replication transaction %u",
+                    skip_xid)));

Isn't it better to move this LOG to the end of this function? The
clear* functions can raise an error, so it is better to emit it after
them. I have done that in the attached patch.
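
The ordering argument can be sketched in standalone C. This is a simplified model, not worker.c: in PostgreSQL an error is raised with ereport(ERROR) rather than returned, and stop_skipping(), clear_ok() and clear_fails() are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a clear* function; false models an error. */
typedef bool (*clear_fn) (void);

bool
clear_ok(void)
{
    return true;
}

bool
clear_fails(void)
{
    return false;
}

/*
 * Finish skipping a transaction.  The "done skipping" message is only
 * produced after the clear step has succeeded, so a failure never
 * leaves a misleading message behind.
 */
bool
stop_skipping(clear_fn clear_skip_xid, unsigned int xid,
              char *logbuf, size_t len)
{
    assert(logbuf != NULL && len > 0);
    logbuf[0] = '\0';

    if (!clear_skip_xid())
        return false;           /* error path: nothing was logged */

    snprintf(logbuf, len,
             "done skipping logical replication transaction %u", xid);
    return true;
}
```

If the message were written before the clear step, the failing case would report "done skipping" even though the skip XID had not been cleared.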

3.
+-- fail - must be superuser
+SET SESSION AUTHORIZATION 'regress_subscription_user2';
+ALTER SUBSCRIPTION regress_testsub SKIP (xid = 100);
+ERROR:  must be owner of subscription regress_testsub

This test doesn't seem to be right. It is meant to provoke the
must-be-superuser error, but the error it gets is the ownership one.
I have changed this test to do what it intends to do.

Apart from this, I have changed a few comments and run pgindent. Do
let me know what you think of the changes.

A few things that I think we can improve in 028_skip_xact.pl are as follows:

After CREATE SUBSCRIPTION, wait for initial sync to be over and
two_phase state to be enabled. Please see 021_twophase. For the
streaming case, we might be able to ensure streaming even with less
data. Can you please try that?

-- 
With Regards,
Amit Kapila.

Attachments

v10-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transa.patch

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 21, 2022, 15:13:12

On Fri, Jan 21, 2022 at 5:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jan 21, 2022 at 10:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
>
> A few things that I think we can improve in 028_skip_xact.pl are as follows:
>
> After CREATE SUBSCRIPTION, wait for initial sync to be over and
> two_phase state to be enabled. Please see 021_twophase. For the
> streaming case, we might be able to ensure streaming even with less
> data. Can you please try that?
>

I noticed that the time taken by the test newly added by this patch is
on the higher side. Compare it with the subscription test that takes the
most time:
[17:38:49] t/028_skip_xact.pl ................. ok     9298 ms
[17:38:59] t/100_bugs.pl ...................... ok    11349 ms

I think we can reduce the time by removing some stream tests without much
impact on coverage, possibly those covering 2PC and streaming together;
if you do that, we probably don't need a subscription with both 2PC
and streaming enabled.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Peter Eisentraut

Date:

January 21, 2022, 16:53:39

On 21.01.22 04:08, Masahiko Sawada wrote:
>> I think the superuser check in AlterSubscription() might no longer be
>> appropriate.  Subscriptions can now be owned by non-superusers.  Please
>> check that.
> 
> IIUC we don't allow non-superuser to own the subscription yet. We
> still have the following superuser checks:
> 
> In CreateSubscription():
> 
>      if (!superuser())
>          ereport(ERROR,
>                  (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
>                   errmsg("must be superuser to create subscriptions")));
> 
> and in AlterSubscriptionOwner_internal();
> 
>      /* New owner must be a superuser */
>      if (!superuser_arg(newOwnerId))
>          ereport(ERROR,
>                  (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
>                   errmsg("permission denied to change owner of
> subscription \"%s\"",
>                          NameStr(form->subname)),
>                   errhint("The owner of a subscription must be a superuser.")));
> 
> Also, doing superuser check here seems to be consistent with
> pg_replication_origin_advance() which is another way to skip
> transactions and also requires superuser permission.

I'm referring to commit a2ab9c06ea15fbcb2bfde570986a06b37f52bcca.  You 
still have to be superuser to create a subscription, but you can change 
the owner to a nonprivileged user and it will observe table permissions 
on the subscriber.

Assuming my understanding of that commit is correct, I think it would be 
sufficient in your patch to check that the current user is the owner of 
the subscription.

Re: Skipping logical replication transactions on subscriber side

From

"David G. Johnston"

Date:

January 21, 2022, 19:29:51

On Fri, Jan 21, 2022 at 4:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

> Apart from this, I have changed a few comments and ran pgindent. Do
> let me know what you think of the changes?

The paragraph describing ALTER SUBSCRIPTION SKIP seems unnecessarily repetitive. Consider:

"""

Skips applying all changes of the specified remote transaction, whose value should be obtained from pg_stat_subscription_workers.last_error_xid. While this will result in avoiding the last error on the subscription, thus allowing it to resume working. See "link to a more holistic description in the Logical Replication chapter" for alternative means of resolving subscription errors. Removing an entire transaction from the history of a table should be considered a last resort as it can leave the system in a very inconsistent state.

Note, this feature will not accept transactions prepared under two-phase commit.

This command sets pg_subscription.subskipxid field upon issuance and the system clears the same field upon seeing and successfully skipped the identified transaction. Issuing this command again while a skipped transaction is pending replaces the existing transaction with the new one.

"""

Then change the subskipxid column description to be:

"""

ID of the transaction whose changes are to be skipped. It is 0 when there are no pending skips. This is set by issuing ALTER SUBSCRIPTION SKIP and resets back to 0 when the identified transactions passes through the subscription stream and is successfully ignored.

"""

I don't understand why/how ", if a valid transaction ID;" comes into play (how would we know whether it is valid, or if we do ALTER SUBSCRIPTION SKIP should prohibit the invalid value from being chosen).

I'm against mentioning subtransactions in the skip_option description.

The Logical Replication page changes provide good content overall but I dislike going into detail about how to perform conflict resolution in the third paragraph and then summarize the various forms of conflict resolution in the newly added fourth. Maybe re-work things like:

1. Logical replication behaves...

2. A conflict will produce...details can be found in places...

3. Resolving conflicts can be done by...

4. (split and reworded) If choosing to simply skip the offending transaction you take the pg_stat_subscription_worker.last_error_xid value (716 in the example above) and provide it while executing ALTER SUBSCRIPTION SKIP...

5. (split and reworded) Prior to v15 ALTER SUBSCRIPTION SKIP was not available and instead you had to use the pg_replication_origin_advance() function...

Don't just list out two options for the user to perform the same action. Tell a story about why we felt compelled to add ALTER SYSTEM SKIP and why either the function is now deprecated or is useful given different circumstances (the former seems likely).

David J.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 22, 2022, 05:54:24

On Fri, Jan 21, 2022 at 7:23 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 21.01.22 04:08, Masahiko Sawada wrote:
> >> I think the superuser check in AlterSubscription() might no longer be
> >> appropriate.  Subscriptions can now be owned by non-superusers.  Please
> >> check that.
> >
> > IIUC we don't allow non-superuser to own the subscription yet. We
> > still have the following superuser checks:
> >
> > In CreateSubscription():
> >
> >      if (!superuser())
> >          ereport(ERROR,
> >                  (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> >                   errmsg("must be superuser to create subscriptions")));
> >
> > and in AlterSubscriptionOwner_internal();
> >
> >      /* New owner must be a superuser */
> >      if (!superuser_arg(newOwnerId))
> >          ereport(ERROR,
> >                  (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> >                   errmsg("permission denied to change owner of
> > subscription \"%s\"",
> >                          NameStr(form->subname)),
> >                   errhint("The owner of a subscription must be a superuser.")));
> >
> > Also, doing superuser check here seems to be consistent with
> > pg_replication_origin_advance() which is another way to skip
> > transactions and also requires superuser permission.
>
> I'm referring to commit a2ab9c06ea15fbcb2bfde570986a06b37f52bcca.  You
> still have to be superuser to create a subscription, but you can change
> the owner to a nonprivileged user and it will observe table permissions
> on the subscriber.
>
> Assuming my understanding of that commit is correct, I think it would be
> sufficient in your patch to check that the current user is the owner of
> the subscription.
>

Won't we already do that for Alter Subscription command which means
nothing special needs to be done for this? However, it seems to me
that the idea we are trying to follow here is that as this option can
lead to data inconsistency, it is good to allow only superusers to
specify this option. The owner of the subscription can be changed to
non-superuser as well in which case I think it won't be a good idea to
allow this option. OTOH, if we think it is okay to allow such an
option to users that don't have superuser privilege then I think
allowing it to the owner of the subscription makes sense to me.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 22, 2022, 08:30:02

On Fri, Jan 21, 2022 at 10:00 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Fri, Jan 21, 2022 at 4:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> Apart from this, I have changed a few comments and ran pgindent. Do
>> let me know what you think of the changes?
>
>
> The paragraph describing ALTER SUBSCRIPTION SKIP seems unnecessarily repetitive.  Consider:
> """
> Skips applying all changes of the specified remote transaction, whose value should be obtained from pg_stat_subscription_workers.last_error_xid.
>

Here, you can also say that the value can be found from server logs as well.

>
> While this will result in avoiding the last error on the
> subscription, thus allowing it to resume working.  See "link to a more
> holistic description in the Logical Replication chapter" for
> alternative means of resolving subscription errors.  Removing an
> entire transaction from the history of a table should be considered a
> last resort as it can leave the system in a very inconsistent state.
>
> Note, this feature will not accept transactions prepared under two-phase commit.
>
> This command sets pg_subscription.subskipxid field upon issuance and the system clears the same field upon seeing and successfully skipped the identified transaction.  Issuing this command again while a skipped transaction is pending replaces the existing transaction with the new one.
> """
>

The proposed text sounds better to me except for a minor change as
suggested above.

> Then change the subskipxid column description to be:
> """
> ID of the transaction whose changes are to be skipped.  It is 0 when there are no pending skips.  This is set by issuing ALTER SUBSCRIPTION SKIP and resets back to 0 when the identified transactions passes through the subscription stream and is successfully ignored.
> """
>

Users can manually reset it by specifying NONE, so that should be
covered in the above text, otherwise, looks good.

> I don't understand why/how ", if a valid transaction ID;" comes into play (how would we know whether it is valid, or if we do ALTER SUBSCRIPTION SKIP should prohibit the invalid value from being chosen).
>

What do you mean by invalid value here? Is it a value less than
FirstNormalTransactionId or the XID of a transaction that did not
fail? For the former, we already have a check in the patch, and
for the latter we can't identify it with any certainty because the
error stats are collected by the stats collector.
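
To illustrate the first kind of check, here is a minimal C sketch. The special XID constants have the same values as in PostgreSQL's access/transam.h; the helper name skip_xid_is_acceptable is hypothetical and is not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Special XIDs, with the same values as in PostgreSQL's access/transam.h. */
#define InvalidTransactionId     ((TransactionId) 0)
#define BootstrapTransactionId   ((TransactionId) 1)
#define FrozenTransactionId      ((TransactionId) 2)
#define FirstNormalTransactionId ((TransactionId) 3)

/*
 * Hypothetical helper (not from the patch): returns true if xid is a
 * normal transaction ID and so could be given to SKIP (xid = ...).
 */
static bool
skip_xid_is_acceptable(TransactionId xid)
{
    return xid >= FirstNormalTransactionId;
}
```

Under this sketch the reserved values 0, 1 and 2 are rejected, while any other value is accepted whether or not it belongs to the failed transaction, which matches the limitation described above.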

> I'm against mentioning subtransactions in the skip_option description.
>

We have mentioned that because currently, we don't support it but in
the future one can come up with an idea to support it. What problem do
you see with it?

> The Logical Replication page changes provide good content overall but I dislike going into detail about how to perform conflict resolution in the third paragraph and then summarize the various forms of conflict resolution in the newly added fourth.  Maybe re-work things like:
>
> 1. Logical replication behaves...
> 2. A conflict will produce...details can be found in places...
> 3. Resolving conflicts can be done by...
> 4. (split and reworded) If choosing to simply skip the offending transaction you take the pg_stat_subscription_worker.last_error_xid value (716 in the example above) and provide it while executing ALTER SUBSCRIPTION SKIP...
> 5. (split and reworded) Prior to v15 ALTER SUBSCRIPTION SKIP was not available and instead you had to use the pg_replication_origin_advance() function...
>
> Don't just list out two options for the user to perform the same action.  Tell a story about why we felt compelled to add ALTER SYSTEM SKIP and why either the function is now deprecated or is useful given different circumstances (the former seems likely).
>

Personally, I don't see much value in the split (especially giving
context like "Prior to v15 ..."), but specifying the circumstances
where each of the options could be useful makes sense.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

"David G. Johnston"

Date:

January 22, 2022, 10:10:53

On Fri, Jan 21, 2022 at 10:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

> On Fri, Jan 21, 2022 at 10:00 PM David G. Johnston
> <david.g.johnston@gmail.com> wrote:
>>
>> On Fri, Jan 21, 2022 at 4:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>
>>> Apart from this, I have changed a few comments and ran pgindent. Do
>>> let me know what you think of the changes?
>>
>>
>> The paragraph describing ALTER SUBSCRIPTION SKIP seems unnecessarily repetitive. Consider:
>> """
>> Skips applying all changes of the specified remote transaction, whose value should be obtained from pg_stat_subscription_workers.last_error_xid.
>>
>
> Here, you can also say that the value can be found from server logs as well.

subscriber's server logs, right? I would agree that adding that for completeness is warranted.

>> Then change the subskipxid column description to be:
>> """
>> ID of the transaction whose changes are to be skipped. It is 0 when there are no pending skips. This is set by issuing ALTER SUBSCRIPTION SKIP and resets back to 0 when the identified transactions passes through the subscription stream and is successfully ignored.
>> """
>>
>
> Users can manually reset it by specifying NONE, so that should be
> covered in the above text, otherwise, looks good.

I agree with incorporating "reset" into the paragraph somehow - does not have to mention NONE, just that ALTER SUBSCRIPTION SKIP (not a family friendly abbreviation...) is what does it.

>> I don't understand why/how ", if a valid transaction ID;" comes into play (how would we know whether it is valid, or if we do ALTER SUBSCRIPTION SKIP should prohibit the invalid value from being chosen).
>>
>
> What do you mean by invalid value here? Is it the value lesser than
> FirstNormalTransactionId or a value that is of the non-error
> transaction? For the former, we already have a check in the patch and
> for later we can't identify it with any certainty because the error
> stats are collected by the stats collector.

The original proposal qualifies the non-zero transaction id in subskipxid as being a "valid transaction ID" and that invalid ones (which is how "otherwise" is interpreted given the "valid" qualification preceding it) are shown as 0. As an end-user that makes me wonder what it means for a transaction ID to be invalid. My point is that dropping the mention of "valid transaction ID" avoids that and lets the reader operate with an understanding that things should "just work". If I see a non-zero in the column I have a pending skip and if I see zero I do not. My wording assumes it is that simple. If it isn't I would need some clarity as to why it is not in order to write something I could read and understand from my inexperienced user-centric point-of-view.

I get that I may provide a transaction ID that is invalid such that the system could never see it (or at least not for a long while) - say we error on transaction 102 and I typo it as 1002 or 101. But I would expect either an error where I make the typo or the numbers 1002 or 101 to appear on the table. I would not expect my 101 typo to result in a 0 appearing on the table (and if it does so today I argue that is a POLA violation). Thus, "if a valid transaction ID" from the original text just doesn't make sense to me.

In typical usage it would seem strange to allow a skip to be recorded if there is no existing error in the subscription. Should we (do we, haven't read the code) warn in that situation?

Or, why even force them to specify a number instead of just saying SKIP and if there is a current error we skip its transaction, otherwise we warn them that nothing happened because there is no last error.

Additionally, the description for pg_stat_subscription_workers should describe what happens once the transaction represented by last_error_xid has either been successfully processed or skipped. Does this "last error" stick around until another error happens (which is hopefully very rare) or does it reset to blanks? Seems like it should reset, which really makes this more of an "active_error" instead of a "last_error". This system is linear, we are stuck until this error is resolved, making it active.

>> I'm against mentioning subtransactions in the skip_option description.
>>
>
> We have mentioned that because currently, we don't support it but in
> the future one can come up with an idea to support it. What problem do
> you see with it?

If you ever get around to implementing the feature then by all means add it. My main issue is that we basically never talk about subtransactions in the user-facing documentation and it doesn't seem desirable to do so here. Knowing that a whole transaction is skipped is all I need to care about as a user. I believe that no users will be asking "what about subtransactions (savepoints)" but by mentioning it less experienced ones will now have something to be curious about that they really do not need to be.

>> The Logical Replication page changes provide good content overall but I dislike going into detail about how to perform conflict resolution in the third paragraph and then summarize the various forms of conflict resolution in the newly added fourth. Maybe re-work things like:
>
> Personally, I don't see much value in the split (especially giving
> context like "Prior to v15 ..) but specifying the circumstances where
> each of the options could be useful.

Yes, I've been reminded of the desire to avoid mentioning versions and agree doing so here is correct. The added context is desired, the style depends on the content.

David J.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 22, 2022, 12:41:46

On Sat, Jan 22, 2022 at 12:41 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Fri, Jan 21, 2022 at 10:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Fri, Jan 21, 2022 at 10:00 PM David G. Johnston
>> <david.g.johnston@gmail.com> wrote:
>> >
>> > On Fri, Jan 21, 2022 at 4:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >>
>> >> Apart from this, I have changed a few comments and ran pgindent. Do
>> >> let me know what you think of the changes?
>> >
>> >
>> > The paragraph describing ALTER SUBSCRIPTION SKIP seems unnecessarily repetitive.  Consider:
>> > """
>> > Skips applying all changes of the specified remote transaction, whose value should be obtained from pg_stat_subscription_workers.last_error_xid.
>> >
>>
>> Here, you can also say that the value can be found from server logs as well.
>
>
> subscriber's server logs, right?
>

Right.

>  I would agree that adding that for completeness is warranted.
>
>>
>> > Then change the subskipxid column description to be:
>> > """
>> > ID of the transaction whose changes are to be skipped.  It is 0 when there are no pending skips.  This is set by issuing ALTER SUBSCRIPTION SKIP and resets back to 0 when the identified transactions passes through the subscription stream and is successfully ignored.
>> > """
>> >
>>
>> Users can manually reset it by specifying NONE, so that should be
>> covered in the above text, otherwise, looks good.
>
>
> I agree with incorporating "reset" into the paragraph somehow - does not have to mention NONE, just that ALTER SUBSCRIPTION SKIP (not a family friendly abbreviation...) is what does it.
>

It is not clear to me what you have in mind here but to me in this
context saying "Setting <literal>NONE</literal> resets the transaction
ID." seems quite reasonable.
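
The lifecycle of subskipxid as described in this thread (set by the command, replaced if the command is issued again, reset with NONE, cleared once the transaction has been skipped) can be modelled with a small C sketch. This is a simplified in-memory model under those assumptions, not code from the patch; SubscriptionModel and the model_* functions are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical in-memory stand-in for one pg_subscription row. */
typedef struct SubscriptionModel
{
    TransactionId subskipxid;   /* 0 means there is no pending skip */
} SubscriptionModel;

/* Models ALTER SUBSCRIPTION ... SKIP (xid = N): replaces any pending XID. */
static void
model_set_skip_xid(SubscriptionModel *sub, TransactionId xid)
{
    assert(xid != InvalidTransactionId);
    sub->subskipxid = xid;
}

/* Models resetting the pending skip by specifying NONE. */
static void
model_reset_skip_xid(SubscriptionModel *sub)
{
    sub->subskipxid = InvalidTransactionId;
}

/*
 * Models the apply worker receiving a remote transaction.  Returns true if
 * the transaction was skipped, in which case the pending XID is cleared.
 */
static bool
model_apply_transaction(SubscriptionModel *sub, TransactionId remote_xid)
{
    if (sub->subskipxid != InvalidTransactionId &&
        sub->subskipxid == remote_xid)
    {
        sub->subskipxid = InvalidTransactionId;
        return true;
    }
    return false;
}
```

In this model a non-zero subskipxid always means a pending skip and zero always means none, which is the reading the proposed column description relies on.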

>>
>> > I don't understand why/how ", if a valid transaction ID;" comes into play (how would we know whether it is valid, or if we do ALTER SUBSCRIPTION SKIP should prohibit the invalid value from being chosen).
>> >
>>
>> What do you mean by invalid value here? Is it the value lesser than
>> FirstNormalTransactionId or a value that is of the non-error
>> transaction? For the former, we already have a check in the patch and
>> for later we can't identify it with any certainty because the error
>> stats are collected by the stats collector.
>
>
> The original proposal qualifies the non-zero transaction id in subskipxid as being a "valid transaction ID" and that invalid ones (which is how "otherwise" is interpreted given the "valid" qualification preceding it) are shown as 0.  As an end-user that makes me wonder what it means for a transaction ID to be invalid.  My point is that dropping the mention of "valid transaction ID" avoids that and lets the reader operate with an understanding that things should "just work". If I see a non-zero in the column I have a pending skip and if I see zero I do not.  My wording assumes it is that simple.  If it isn't I would need some clarity as to why it is not in order to write something I could read and understand from my inexperienced user-centric point-of-view.
>
> I get that I may provide a transaction ID that is invalid such that the system could never see it (or at least not for a long while) - say we error on transaction 102 and I typo it as 1002 or 101.  But I would expect either an error where I make the typo or the numbers 1002 or 101 to appear on the table.  I would not expect my 101 typo to result in a 0 appearing on the table (and if it does so today I argue that is a POLA violation).  Thus, "if a valid transaction ID" from the original text just doesn't make sense to me.
>
> In typical usage it would seem strange to allow a skip to be recorded if there is no existing error in the subscription. Should we (do we, haven't read the code) warn in that situation?
>

Yeah, we will error in that situation. The only invalid values are
system reserved values (1,2).

> Or, why even force them to specify a number instead of just saying SKIP and if there is a current error we skip its transaction, otherwise we warn them that nothing happened because there is no last error.
>

The idea is that we might extend this feature to skip specific
operations on relations or maybe by having other identifiers. One idea
we discussed was to automatically fetch the last error xid but then
decided it can be done as a later patch.

> Additionally, the description for pg_stat_subscription_workers should describe what happens once the transaction represented by last_error_xid has either been successfully processed or skipped.  Does this "last error" stick around until another error happens (which is hopefully very rare) or does it reset to blanks?
>

It will be reset only on subscription drop, otherwise, it will stick
around until another error happens.

> Seems like it should reset, which really makes this more of an "active_error" instead of a "last_error".  This system is linear, we are stuck until this error is resolved, making it active.
>
>>
>> > I'm against mentioning subtransactions in the skip_option description.
>> >
>>
>> We have mentioned that because currently, we don't support it but in
>> the future one can come up with an idea to support it. What problem do
>> you see with it?
>
>
> If you ever get around to implementing the feature then by all means add it.  My main issue is that we basically never talk about subtransactions in the user-facing documentation and it doesn't seem desirable to do so here.  Knowing that a whole transaction is skipped is all I need to care about as a user.  I believe that no users will be asking "what about subtransactions (savepoints)" but by mentioning it less experienced ones will now have something to be curious about that they really do not need to be.
>

It is not that we don't mention subtransactions in the docs but I see
your point and I think we can avoid mentioning it in this case.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

"David G. Johnston"

Date:

January 22, 2022, 19:21:24

On Sat, Jan 22, 2022 at 2:41 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

> On Sat, Jan 22, 2022 at 12:41 PM David G. Johnston
> <david.g.johnston@gmail.com> wrote:
>>
>> On Fri, Jan 21, 2022 at 10:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>>
>>> On Fri, Jan 21, 2022 at 10:00 PM David G. Johnston
>>> <david.g.johnston@gmail.com> wrote:
>>> >
>>> > On Fri, Jan 21, 2022 at 4:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> >>
>>
>> I agree with incorporating "reset" into the paragraph somehow - does not have to mention NONE, just that ALTER SUBSCRIPTION SKIP (not a family friendly abbreviation...) is what does it.
>>
>
> It is not clear to me what you have in mind here but to me in this
> context saying "Setting <literal>NONE</literal> resets the transaction
> ID." seems quite reasonable.
>
> Yeah, we will error in that situation. The only invalid values are
> system reserved values (1,2).

So long as the ALTER command errors when asked to skip those IDs there isn't any reason for an end-user, who likely doesn't know or care that 1 and 2 are special, to be concerned about them (the only two invalid values) while reading the docs.

>> Or, why even force them to specify a number instead of just saying SKIP and if there is a current error we skip its transaction, otherwise we warn them that nothing happened because there is no last error.
>>
>
> The idea is that we might extend this feature to skip specific
> operations on relations or maybe by having other identifiers.

Again, you've already got syntax reserved that lets you add more features to this command in the future; and removing warnings or errors because new features make them moot is easy. Let's document and code what we are willing to implement today. A single top-level transaction xid that is presently blocking the worker from applying any more WAL.

> One idea
> we discussed was to automatically fetch the last error xid but then
> decided it can be done as a later patch.

This seems backwards. The user-friendly approach is to not make them type in anything at all. That said, this particular UX seems like it could use some safety. Thus I would propose at this time that attempting to set the skip_option to anything but THE active_error_xid for the named subscription results in an error. Once you add new features the user can set the skip_option to other things without provoking errors. Again, I consider this a safety feature since the user now has to accurately match the xid to the name in the SQL in order to perform a successful skip - and the to-be affected transaction has to be one that is preventing replication from moving forward. I'm not interested in providing a foot-gun where an arbitrary future transaction can be scheduled to be skipped. Running the command twice with the same values should provoke an error since the first run should be allowed to finish (?). Also, we handle the situation where the state of the worker changes between when the user saw the error and wrote down the xid to skip and the actual execution of the alter command. Maybe not highly anticipated scenarios but this is an easy win to deal with them.
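
The rule proposed here can be sketched in C as follows. This only illustrates the proposal and is not code from the patch; WorkerModel, active_error_xid and model_request_skip are hypothetical names, and rejecting a repeated request is the tentative "(?)" behavior mentioned above.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId ((TransactionId) 0)

/* Possible outcomes of a skip request under the proposed rule. */
typedef enum SkipRequestResult
{
    SKIP_ACCEPTED,
    SKIP_REJECTED_NO_ACTIVE_ERROR,
    SKIP_REJECTED_XID_MISMATCH,
    SKIP_REJECTED_ALREADY_PENDING
} SkipRequestResult;

/* Hypothetical model of one subscription's apply worker state. */
typedef struct WorkerModel
{
    TransactionId active_error_xid; /* 0 when the worker is not blocked */
    TransactionId subskipxid;       /* 0 when no skip is pending */
} WorkerModel;

/*
 * Accept a skip request only if xid is the transaction that currently
 * blocks the worker and no identical request is already pending.
 */
static SkipRequestResult
model_request_skip(WorkerModel *w, TransactionId xid)
{
    if (w->active_error_xid == InvalidTransactionId)
        return SKIP_REJECTED_NO_ACTIVE_ERROR;
    if (xid != w->active_error_xid)
        return SKIP_REJECTED_XID_MISMATCH;
    if (w->subskipxid == xid)
        return SKIP_REJECTED_ALREADY_PENDING;

    w->subskipxid = xid;
    return SKIP_ACCEPTED;
}
```

With this rule a typo such as 1002 for 102 fails at the time the command is issued instead of leaving an arbitrary future transaction scheduled to be skipped.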

>> Additionally, the description for pg_stat_subscription_workers should describe what happens once the transaction represented by last_error_xid has either been successfully processed or skipped. Does this "last error" stick around until another error happens (which is hopefully very rare) or does it reset to blanks?
>>
>
> It will be reset only on subscription drop, otherwise, it will stick
> around until another error happens.

I really dislike the user experience this provides, and given it is new in v15 (and right now this table seems to exist solely to support this feature) changing this seems within the realm of possibility. I have to imagine these workers have a sense of local state that would just be "no errors, no need to touch pg_stat_subscription_workers at the end of this transaction's commit". It would save a local state of the error_xid and if a successfully committed transaction has that xid it would clear the error. The skip code path would also check for and see the matching xid value and clear the error. Even if the local state thing doesn't work, one catalog lookup per transaction seems like potentially reasonable overhead to incur here.

David J.

Re: Skipping logical replication transactions on subscriber side

From

"David G. Johnston"

Date:

January 22, 2022, 19:47:14

On Sat, Jan 22, 2022 at 9:21 AM David G. Johnston <david.g.johnston@gmail.com> wrote:

> On Sat, Jan 22, 2022 at 2:41 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>>> Additionally, the description for pg_stat_subscription_workers should describe what happens once the transaction represented by last_error_xid has either been successfully processed or skipped. Does this "last error" stick around until another error happens (which is hopefully very rare) or does it reset to blanks?
>>>
>>
>> It will be reset only on subscription drop, otherwise, it will stick
>> around until another error happens.
>
> I really dislike the user experience this provides, and given it is new in v15 (and right now this table seems to exist solely to support this feature) changing this seems within the realm of possibility. I have to imagine these workers have a sense of local state that would just be "no errors, no need to touch pg_stat_subscription_workers at the end of this transaction's commit". It would save a local state of the error_xid and if a successfully committed transaction has that xid it would clear the error. The skip code path would also check for and see the matching xid value and clear the error. Even if the local state thing doesn't work, one catalog lookup per transaction seems like potentially reasonable overhead to incur here.

It shouldn't even need to be that overhead intensive. Once an error is encountered the system stops. By construction it must be told to redo, at which point the information about "last error" is no longer relevant and can be removed (for skipping the user/system will have already done everything with the xid that is needed before the redo is issued). In the steady-state it then is simply empty until a new error arises at which point it becomes populated again; and stays that way until the system goes into redo mode as instructed by the user via one of several methods.

David J.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 24, 2022, 05:53:26

On Fri, Jan 21, 2022 at 9:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jan 21, 2022 at 5:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Jan 21, 2022 at 10:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> >
> > Few things that I think we can improve in 028_skip_xact.pl are as follows:
> >
> > After CREATE SUBSCRIPTION, wait for initial sync to be over and
> > two_phase state to be enabled. Please see 021_twophase. For the
> > streaming case, we might be able to ensure streaming even with lesser
> > data. Can you please try that?
> >
>
> I noticed that the newly added test by this patch takes time is on the
> upper side. See comparison with the subscription test that takes max
> time:
> [17:38:49] t/028_skip_xact.pl ................. ok     9298 ms
> [17:38:59] t/100_bugs.pl ...................... ok    11349 ms
>
> I think we can reduce time by removing some stream tests without much
> impacting on coverage, possibly related to 2PC and streaming together,
> and if you do that we probably don't need a subscription with both 2PC
> and streaming enabled.

Agreed.

In addition to that, after some tests, I realized that the two tests
of ROLLBACK PREPARED are not stable. If the walsender detects a
concurrent abort of the transaction that is being decoded, it’s
possible that it sends only begin_prepare and prepare messages. If
this happens before setting skip_xid, a unique key constraint
violation doesn’t occur on the subscription, and consequently,
skip_xid is not cleared. We can reduce the possibility by setting a
very high value for wal_retrieve_retry_interval but I think it’s
better to remove them. What do you think?

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 24, 2022, 05:57:59

On Fri, Jan 21, 2022 at 8:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jan 21, 2022 at 10:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Jan 21, 2022 at 1:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > What do we want to indicate by [, ... ]? To me, it appears like
> > > multiple options but that is not what we support currently.
> >
> > You're right. It's an oversight.
> >
>
> I have fixed this and a few other things in the attached patch.

Thank you for updating the patch!

> 1.
> The newly added column needs to be updated in the following statement:
> -- All columns of pg_subscription except subconninfo are publicly readable.
> REVOKE ALL ON pg_subscription FROM public;
> GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
>               substream, subtwophasestate, subslotname, subsynccommit,
> subpublications)
>     ON pg_subscription TO public;
>
> 2.
> +stop_skipping_changes(bool clear_subskipxid, XLogRecPtr origin_lsn,
> +   TimestampTz origin_timestamp)
> +{
> + Assert(is_skipping_changes());
> +
> + ereport(LOG,
> + (errmsg("done skipping logical replication transaction %u",
> + skip_xid)));
>
> Isn't it better to move this LOG at the end of this function? Because
> clear* functions can give an error, so it is better to move it after
> that. I have done that in the attached.
>
> 3.
> +-- fail - must be superuser
> +SET SESSION AUTHORIZATION 'regress_subscription_user2';
> +ALTER SUBSCRIPTION regress_testsub SKIP (xid = 100);
> +ERROR:  must be owner of subscription regress_testsub
>
> This test doesn't seem to be right. You want to get the error for the
> superuser but the error is for the owner. I have changed this test to
> do what it intends to do.
>
> Apart from this, I have changed a few comments and ran pgindent. Do
> let me know what you think of the changes?

Agree with these changes.

>
> Few things that I think we can improve in 028_skip_xact.pl are as follows:
>
> After CREATE SUBSCRIPTION, wait for initial sync to be over and
> two_phase state to be enabled. Please see 021_twophase.

Agreed.

> For the
> streaming case, we might be able to ensure streaming even with lesser
> data. Can you please try that?

Yeah, after some tests, it's enough to insert 500 rows as follows:

INSERT INTO test_tab_streaming SELECT i, md5(i::text) FROM
generate_series(1, 500) s(i);

I've just sent another email suggesting that we can probably remove
the two tests for ROLLBACK PREPARED, so I'll update the patch to
include this point as well.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 24, 2022, 06:06:38

On Mon, Jan 24, 2022 at 8:24 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Jan 21, 2022 at 9:13 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Jan 21, 2022 at 5:25 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Jan 21, 2022 at 10:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > >
> > > Few things that I think we can improve in 028_skip_xact.pl are as follows:
> > >
> > > After CREATE SUBSCRIPTION, wait for initial sync to be over and
> > > two_phase state to be enabled. Please see 021_twophase. For the
> > > streaming case, we might be able to ensure streaming even with lesser
> > > data. Can you please try that?
> > >
> >
> > I noticed that the newly added test by this patch takes time is on the
> > upper side. See comparison with the subscription test that takes max
> > time:
> > [17:38:49] t/028_skip_xact.pl ................. ok     9298 ms
> > [17:38:59] t/100_bugs.pl ...................... ok    11349 ms
> >
> > I think we can reduce time by removing some stream tests without much
> > impacting on coverage, possibly related to 2PC and streaming together,
> > and if you do that we probably don't need a subscription with both 2PC
> > and streaming enabled.
>
> Agreed.
>
> In addition to that, after some tests, I realized that the two tests
> of ROLLBACK PREPARED are not stable. If the walsender detects a
> concurrent abort of the transaction that is being decoded, it's
> possible that it sends only begin_prepare and prepare messages. If
> this happens before setting skip_xid, a unique key constraint
> violation doesn't occur on the subscription, and consequently,
> skip_xid is not cleared. We can reduce the possibility by setting
> wal_retrieve_retry_interval to a very high value, but I think it's
> better to remove them.
>

+1.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 24, 2022, 06:34:58

On Sat, Jan 22, 2022 at 9:51 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> So long as the ALTER command errors when asked to skip those IDs there isn't any reason for an end-user, who likely doesn't know or care that 1 and 2 are special, to be concerned about them (the only two invalid values) while reading the docs.
>

On this point, I don't see any problem with the currently proposed
text, and many others have also reviewed it. I am fine with changing
it if others also think that the current text needs to be changed.

>>
>> > Additionally, the description for pg_stat_subscription_workers should describe what happens once the transaction represented by last_error_xid has either been successfully processed or skipped.  Does this "last error" stick around until another error happens (which is hopefully very rare) or does it reset to blanks?
>> >
>>
>> It will be reset only on subscription drop, otherwise, it will stick
>> around until another error happens.
>
>
> I really dislike the user experience this provides, and given it is new in v15 (and right now this table seems to exist solely to support this feature) changing this seems within the realm of possibility. I have to imagine these workers have a sense of local state that would just be "no errors, no need to touch pg_stat_subscription_workers at the end of this transaction's commit".  It would save a local state of the error_xid and if a successfully committed transaction has that xid it would clear the error.  The skip code path would also check for and see the matching xid value and clear the error.  Even if the local state thing doesn't work, one catalog lookup per transaction seems like potentially reasonable overhead to incur here.
>

Are you suggesting that we update the catalog to save error_xid when
an error occurs? If so, that has many challenges; for example, we are
not supposed to perform any such operations when the transaction is in
an error state. We discussed this and other ideas at the beginning. I
don't find any of your arguments convincing enough to change the basic
approach here, but I would like to see what others think on this matter.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

"David G. Johnston"

Date:

January 24, 2022, 07:48:48

On Sun, Jan 23, 2022 at 8:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

>> I really dislike the user experience this provides, and given it is new in v15 (and right now this table seems to exist solely to support this feature) changing this seems within the realm of possibility. I have to imagine these workers have a sense of local state that would just be "no errors, no need to touch pg_stat_subscription_workers at the end of this transaction's commit". It would save a local state of the error_xid and if a successfully committed transaction has that xid it would clear the error. The skip code path would also check for and see the matching xid value and clear the error. Even if the local state thing doesn't work, one catalog lookup per transaction seems like potentially reasonable overhead to incur here.
>>
>
> Are you telling to update the catalog to save error_xid when an error
> occurs? If so, that has many challenges like we are not supposed to
> perform any such operations when the transaction is in an error state.
> We have discussed this and other ideas in the beginning. I don't find
> any of your arguments convincing to change the basic approach here but
> I would like to see what others think on this matter?

Then how does the table get updated to that state in the first place since it doesn't know the error details until there is an error?

In any case, clearing out the entries in the table would not happen while it is applying the replication stream, in an error state or otherwise.

in = while streaming

out = not streaming

1(in). replication stream is working

2(in). replication stream fails; capture error information

3(in->out). stop replication stream; perform rollback on xid

4(out). update pg_stat_subscription_workers to report the failure, including xid of the transaction

5(out). wait for the user to manually restart the replication stream

[if they do so by skipping the xid, save the xid from pg_stat_subscription_workers into pg_subscription.subskipxid - possibly requiring the user to confirm the xid]

[user has now done their thing and requested that the replication stream resume]

6(out). clear the error information from pg_stat_subscription_workers; it is no longer useful/doesn't exist because the user just took action to avoid that very error, one way (skipping its transaction) or another.

7(out->in). resume the replication stream, return to step 1

You are already doing steps 1-5 and 7 today; however, you are forced to deal with transactions and catalog access. I am just adding step 6, which turns last_error_xid into current_error_xid because it is the current value of the error in the stream during step 5, when the user needs to decide how to recover from the error. Once the user decides and the stream resumes, that error information has no value (go look in the logs if you want history). Thus when 7 comes around and the stream is restarted, the error info in pg_stat_subscription_workers is empty, waiting for the next error to happen. If the user did nothing in step 5, then when that same WAL is replayed at step 2 the error will come back.

The main thing is how many ways can the user exit step 5 and to make sure that no matter which way they exit step 6 happens before step 7.
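
As a rough illustration of the ordering argued for above (step 6 always happening before step 7), here is a minimal C sketch. It is not taken from the patch: WorkerState, stream_failed and resume_stream are hypothetical names, and a plain struct stands in for the row of the stats view.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;              /* stand-in for PostgreSQL's type */
#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical stand-in for what the stats view would report for one worker. */
typedef struct WorkerState
{
    bool          streaming;    /* "in" vs. "out" in the outline above */
    TransactionId error_xid;    /* xid of the failed transaction, if any */
} WorkerState;

/* Steps 2-4: the stream fails, we leave streaming mode and record the xid. */
static void
stream_failed(WorkerState *w, TransactionId failed_xid)
{
    assert(w->streaming);
    w->streaming = false;
    w->error_xid = failed_xid;
}

/*
 * Steps 6-7: every way of leaving the "waiting for the user" state goes
 * through this one function, so the error info is always cleared before
 * the stream resumes.
 */
static void
resume_stream(WorkerState *w)
{
    assert(!w->streaming);
    w->error_xid = InvalidTransactionId;    /* step 6 */
    w->streaming = true;                    /* step 7 */
}
```

Funnelling every exit from step 5 through a single function is one way to get the "no matter which way they exit" property; whether the real worker can be structured like this is exactly what is under discussion.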

David J.

RE: Skipping logical replication transactions on subscriber side

From

"tanghy.fnst@fujitsu.com"

Date:

January 24, 2022, 08:55:27

On Fri, Jan 21, 2022 7:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> 
> 2.
> +stop_skipping_changes(bool clear_subskipxid, XLogRecPtr origin_lsn,
> +   TimestampTz origin_timestamp)
> +{
> + Assert(is_skipping_changes());
> +
> + ereport(LOG,
> + (errmsg("done skipping logical replication transaction %u",
> + skip_xid)));
> 
> Isn't it better to move this LOG at the end of this function? Because
> clear* functions can give an error, so it is better to move it after
> that. I have done that in the attached.
> 

+    /* Stop skipping changes */
+    skip_xid = InvalidTransactionId;
+
+    ereport(LOG,
+            (errmsg("done skipping logical replication transaction %u",
+                    skip_xid)));


I think we can move the LOG before resetting skip_xid, otherwise skip_xid would
always be 0 in the LOG.
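
To make the ordering issue concrete, here is a small standalone C sketch. It is not the patch's code: snprintf into a buffer stands in for ereport(LOG, ...), and the two function names are made up for the comparison.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint32_t TransactionId;              /* stand-in for PostgreSQL's type */
#define InvalidTransactionId ((TransactionId) 0)

static TransactionId skip_xid = InvalidTransactionId;

/* Stand-in for the ereport(LOG, ...) call: formats the message into buf. */
static void
log_done_skipping(char *buf, size_t len)
{
    snprintf(buf, len, "done skipping logical replication transaction %u",
             (unsigned) skip_xid);
}

/* Order in the hunk above: reset first, then log, so the message shows 0. */
static void
stop_skipping_reset_first(char *buf, size_t len)
{
    assert(skip_xid != InvalidTransactionId);
    skip_xid = InvalidTransactionId;
    log_done_skipping(buf, len);
}

/* Suggested order: log while skip_xid still holds the skipped xid. */
static void
stop_skipping_log_first(char *buf, size_t len)
{
    assert(skip_xid != InvalidTransactionId);
    log_done_skipping(buf, len);
    skip_xid = InvalidTransactionId;
}
```

An alternative that keeps the LOG at the end of the function would be to copy skip_xid into a local variable before resetting it.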

Regards,
Tang

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 24, 2022, 09:54:35

On Mon, Jan 24, 2022 at 1:49 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Sun, Jan 23, 2022 at 8:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> > I really dislike the user experience this provides, and given it is new in v15 (and right now this table seems to exist solely to support this feature) changing this seems within the realm of possibility. I have to imagine these workers have a sense of local state that would just be "no errors, no need to touch pg_stat_subscription_workers at the end of this transaction's commit".  It would save a local state of the error_xid and if a successfully committed transaction has that xid it would clear the error.  The skip code path would also check for and see the matching xid value and clear the error.  Even if the local state thing doesn't work, one catalog lookup per transaction seems like potentially reasonable overhead to incur here.
>> >
>>
>> Are you telling to update the catalog to save error_xid when an error
>> occurs? If so, that has many challenges like we are not supposed to
>> perform any such operations when the transaction is in an error state.
>> We have discussed this and other ideas in the beginning. I don't find
>> any of your arguments convincing to change the basic approach here but
>> I would like to see what others think on this matter?
>>
>
> Then how does the table get updated to that state in the first place since it doesn't know the error details until there is an error?

I think your idea is based on storing error information, including
the XID, in the system catalog. I think that the reasons why we use
the stats collector to store error information including
last_error_xid are (1) as Amit mentioned, updating the catalog while
the transaction is in an error state would pose many challenges, and
(2) we can store more information than the XID, such as error
messages, the action, etc., so that users can tell that the reported
error is a conflict error and not some other type of error such as an
OOM error. For these reasons, it makes sense to me to store
subscribers' error information by using the stats collector.

When it comes to reporting a message to the stats collector, we need
to note that it's not guaranteed that all messages arrive at the stats
collector. Therefore, last_error_xid doesn't necessarily get
updated after the worker reports an error. Similarly, the same is true
for clearing subskipxid. I agree that it's useful if
pg_subscription.subskipxid is automatically set when executing ALTER
SUBSCRIPTION SKIP but it might not work in some cases because of this
restriction.

There is another idea of storing error XID on shmem (e.g., in
ReplicationState) in addition to reporting error details to the stats
collector and using the XID when skipping the transaction, but I'm not
sure whether it's a reliable way.

Anyway, even if subskipxid is automatically set when ALTER
SUBSCRIPTION SKIP, I think we need to provide a way to clear it as the
current patch does (setting NONE) just in case.

>
> In any case, clearing out the entries in the table would not happen while it is applying the replication stream, in an error state or otherwise.
>
> in = while streaming
> out = not streaming
>
> 1(in). replication stream is working
> 2(in). replication stream fails; capture error information
> 3(in->out). stop replication stream; perform rollback on xid
> 4(out). update pg_stat_subscription_worker to report the failure, including xid of the transaction
> 5(out). wait for the user to manually restart the replication stream

Do you mean that user intervention is always required after an error
so that the replication stream can resume?

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

"David G. Johnston"

Date:

January 24, 2022, 10:59:54

On Sun, Jan 23, 2022 at 11:55 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

> On Mon, Jan 24, 2022 at 1:49 PM David G. Johnston
> <david.g.johnston@gmail.com> wrote:
> >
> > On Sun, Jan 23, 2022 at 8:35 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >>
> >> > I really dislike the user experience this provides, and given it is new in v15 (and right now this table seems to exist solely to support this feature) changing this seems within the realm of possibility. I have to imagine these workers have a sense of local state that would just be "no errors, no need to touch pg_stat_subscription_workers at the end of this transaction's commit". It would save a local state of the error_xid and if a successfully committed transaction has that xid it would clear the error. The skip code path would also check for and see the matching xid value and clear the error. Even if the local state thing doesn't work, one catalog lookup per transaction seems like potentially reasonable overhead to incur here.
> >> >
> >>
> >> Are you telling to update the catalog to save error_xid when an error
> >> occurs? If so, that has many challenges like we are not supposed to
> >> perform any such operations when the transaction is in an error state.
> >> We have discussed this and other ideas in the beginning. I don't find
> >> any of your arguments convincing to change the basic approach here but
> >> I would like to see what others think on this matter?
> >>
> >
> > Then how does the table get updated to that state in the first place since it doesn't know the error details until there is an error?
>
> I think your idea is based on storing error information including XID
> is stored in the system catalog. I think that the reasons why we use
> the stats collector

I noticed this dynamic while skimming the patch (and also pondering why the new worker table was not in a catalog chapter) but am only now fully beginning to appreciate its impact on this discussion.

> to store error information including
> last_error_xid are (1) as Amit mentioned, it would have many
> challenges if updating the catalog when the transaction is in an error
> state, and

I'm going on faith right now that this is a problem. But from my prior outline I hope you can see why I find it surprising. Don't try to update a catalog while in an error state. Get out of the error state first. e.g., A transient "holding pattern" would seem to work. Upon a server restart the transient state would be forgotten, it would attempt to reapply the wal, would see the same error, and would then go back into the transient holding pattern. I do intend to read the other discussion on this particular topic so a detailed rebuttal, if warranted, can be withheld.

> (2) we can store more information such as error messages,
> action, etc. other than XID so that users can identify that the
> reported error is a conflict error but not other types of error such
> as OOM error.

I mentioned only XID because of the focus on SKIP. The other data already present in that table is ok. Whether we use a catalog or the stats collector seems irrelevant. If anything the catalog makes more sense - calling an error message a statistic is a bit of a reach.

> Similarly, the same is true
> for clearing subskipxid. I agree that it's useful if
> pg_subscription.subskipxid is automatically set when executing ALTER
> SUBSCRIPTION SKIP but it might not work in some cases because of this
> restriction. For these reasons to me, it makes sense to store
> subscribers' error information by using the stats collector.

I'm confused - pg_subscription is a catalog, not a stat view. Why is it affected?

I don't see how point 2 prevents using a system catalog. I accept point 1 as true but will need to read some of the prior discussion to really understand it.

> When it comes to reporting a message to the stats collector, we need
> to note that it's not guaranteed that all messages arrive at the stats
> collector. Therefore, last_error_xid doesn't necessarily get
> updated after the worker reports an error.

You'll forgive me for not considering this due to its apparent lack of mention in the documentation [*] and its arguable classification as a POLA violation.

[*] https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-SUBSCRIPTION

What I do read there seems compatible with the desired user experience. 500ms lag, idle transaction oriented, reset upon unclean shutdown, and consumers seeing a stable transactional view: none of these seem like show-stoppers.

> Anyway, even if subskipxid is automatically set when ALTER
> SUBSCRIPTION SKIP, I think we need to provide a way to clear it as the
> current patch does (setting NONE) just in case.

With my suggestion of requiring a matching xid the whole option for skip_xid = { xid | NONE } remains.

>> 5(out). wait for the user to manually restart the replication stream
>
> Do you mean that there always is user intervention after error so the
> replication stream can resume?

That is my working assumption. It doesn't seem like the system would auto-resume without a DBA doing something (I'll attribute a server crash to the DBA for convenience).

Apparently I need to read more about how the system works today to understand how this varies from and integrates with today's user experience.

That said, at present my two dislikes:

1) ALTER SUBSCRIPTION ... SKIP accepts any xid value (I need to consider further the timing of when this resets to zero)

2) pg_stat_subscription_workers.last_error_* fields remain populated even while the system is in a normal operating state.

are preventing me from preferring this patch over the status quo (yes, I know the 2nd point is about a committed feature), regardless of how far off I may be regarding our technical ability to change them to a more (IMO) user-friendly design.

David J.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 24, 2022, 12:42:58

On Mon, Jan 24, 2022 at 1:30 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> That said, at present my two dislikes:
>
> 1) ALTER SUBSCRIPTION ... SKIP accepts any xid value (I need to consider further the timing of when this resets to zero)
>

I think this is required for future extension of this feature, wherein
there could be multiple such xids, say when we support parallel apply
workers. If we find a good way to do it, we can do so even after the
first version, for example by making the xid an optional parameter.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 24, 2022, 12:54:20

On Mon, Jan 24, 2022 at 5:00 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Sun, Jan 23, 2022 at 11:55 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
> >Similarly, the same is true
> >for clearing subskipxid.
>
> I'm confused - pg_subscription is a catalog, not a stat view.  Why is it affected?

Sorry, I mistook last_error_xid for subskipxid here.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Peter Eisentraut

Date:

January 24, 2022, 17:06:11

On 22.01.22 03:54, Amit Kapila wrote:
> Won't we already do that for Alter Subscription command which means
> nothing special needs to be done for this? However, it seems to me
> that the idea we are trying to follow here is that as this option can
> lead to data inconsistency, it is good to allow only superusers to
> specify this option. The owner of the subscription can be changed to
> non-superuser as well in which case I think it won't be a good idea to
> allow this option. OTOH, if we think it is okay to allow such an
> option to users that don't have superuser privilege then I think
> allowing it to the owner of the subscription makes sense to me.

I don't think this functionality allows a nonprivileged user to do 
anything they couldn't otherwise do.  You can create inconsistent data 
in the sense that you can choose not to apply certain replicated data. 
But a subscription owner has to have write access to the target tables 
of the subscription, so they already have the ability to write or not 
write any data they want.

Re: Skipping logical replication transactions on subscriber side

From

Peter Eisentraut

Date:

January 24, 2022, 17:10:35

On 22.01.22 10:41, Amit Kapila wrote:
>> Additionally, the description for pg_stat_subscription_workers should describe what happens once the transaction represented by last_error_xid has either been successfully processed or skipped.  Does this "last error" stick around until another error happens (which is hopefully very rare) or does it reset to blanks?
>>
> It will be reset only on subscription drop, otherwise, it will stick
> around until another error happens.

Is this going to be a problem with transaction ID wraparound?  Do we 
need to use 64-bit xids for this?

Re: Skipping logical replication transactions on subscriber side

From

"David G. Johnston"

Date:

January 24, 2022, 17:42:34

On Monday, January 24, 2022, Amit Kapila <amit.kapila16@gmail.com> wrote:

> On Mon, Jan 24, 2022 at 1:30 PM David G. Johnston
> <david.g.johnston@gmail.com> wrote:
> >
> > That said, at present my two dislikes:
> >
> > 1) ALTER SUBSCRIPTION ... SKIP accepts any xid value (I need to consider further the timing of when this resets to zero)
> >
>
> I think this is required for future extension of this feature wherein
> I think there could be multiple such xids say when we support parallel
> apply workers. I think if we get a good way to do it even after the
> first version like by making a xid an optional parameter.

Extending the behavior is doable, and maybe we end up without this limitation in the future, so be it. But I’m having a hard time imagining a scenario where the xid is not already known to the system, and the user, and wants to be in effect for a very short window.

David J.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 25, 2022, 05:54:45

On Mon, Jan 24, 2022 at 7:36 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 22.01.22 03:54, Amit Kapila wrote:
> > Won't we already do that for Alter Subscription command which means
> > nothing special needs to be done for this? However, it seems to me
> > that the idea we are trying to follow here is that as this option can
> > lead to data inconsistency, it is good to allow only superusers to
> > specify this option. The owner of the subscription can be changed to
> > non-superuser as well in which case I think it won't be a good idea to
> > allow this option. OTOH, if we think it is okay to allow such an
> > option to users that don't have superuser privilege then I think
> > allowing it to the owner of the subscription makes sense to me.
>
> I don't think this functionality allows a nonprivileged user to do
> anything they couldn't otherwise do.  You can create inconsistent data
> in the sense that you can choose not to apply certain replicated data.
>

I thought this will be the only primary way to skip applying certain
transactions. The other could be via pg_replication_origin_advance().
Or are you talking about the case where we skip applying update/delete
where the corresponding rows are not found?

I see the point that if we can allow the owner to skip applying
updates/deletes in certain cases then probably this should also be
okay. Kindly let us know if you have something else in mind as well?

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 25, 2022, 08:18:11

On Mon, Jan 24, 2022 at 7:40 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 22.01.22 10:41, Amit Kapila wrote:
> >> Additionally, the description for pg_stat_subscription_workers should describe what happens once the transaction represented by last_error_xid has either been successfully processed or skipped.  Does this "last error" stick around until another error happens (which is hopefully very rare) or does it reset to blanks?
> >>
> > It will be reset only on subscription drop, otherwise, it will stick
> > around until another error happens.
>
> Is this going to be a problem with transaction ID wraparound?
>

I think to avoid this we can send a message to clear this (at least to
clear XID in the view) after skipping the xact but there is no
guarantee that it will be received by the stats collector.
Additionally, the worker can periodically (say, after every N
successful transactions, with N being 100, 500, etc.) send a clear
message after a successful apply. This will ensure that eventually the
error entry will be cleared.
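
As a rough sketch of the periodic-clear idea (not code from any patch; CLEAR_INTERVAL and xact_applied are hypothetical names, and 100 is just one of the example values for N):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical N: how many successful transactions between clear messages. */
#define CLEAR_INTERVAL 100

static unsigned int successful_xacts = 0;

/*
 * Called after each successfully applied transaction.  Returns true when
 * the worker should (re)send a "clear error" message to the stats
 * collector.  Because the message is resent every CLEAR_INTERVAL
 * transactions, a lost message only delays the clearing.
 */
static bool
xact_applied(void)
{
    if (++successful_xacts >= CLEAR_INTERVAL)
    {
        successful_xacts = 0;
        return true;
    }
    return false;
}
```

The trade-off is between extra stats messages (small N) and how long a stale error entry can remain visible (large N).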

>  Do we
> need to use 64-bit xids for this?
>

For 64-bit XIDs, as this reported XID is for the remote transactions,
I think we need to add 4 bytes to each transaction message (say, Begin)
and that could be costly for small transactions. We also probably need
to make logical decoding aware of 64-bit XIDs? Note that XIDs in WAL
records are still 32-bit. I don't think this feature deserves such
a big (in terms of WAL and network message size) change.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Peter Eisentraut

Date:

January 25, 2022, 15:48:38

On 25.01.22 03:54, Amit Kapila wrote:
>> I don't think this functionality allows a nonprivileged user to do
>> anything they couldn't otherwise do.  You can create inconsistent data
>> in the sense that you can choose not to apply certain replicated data.
>>
> I thought this will be the only primary way to skip applying certain
> transactions. The other could be via pg_replication_origin_advance().
> Or are you talking about the case where we skip applying update/delete
> where the corresponding rows are not found?
> 
> I see the point that if we can allow the owner to skip applying
> updates/deletes in certain cases then probably this should also be
> okay. Kindly let us know if you have something else in mind as well?

Let's start this again: The question at hand is whether ALTER 
SUBSCRIPTION ... SKIP should be allowed for subscription owners that are 
not superusers.  The argument raised against that was that this would 
allow the owner to create "inconsistent" data.  But it hasn't been 
explained what that actually means or why it is dangerous.

Re: Skipping logical replication transactions on subscriber side

From

Peter Eisentraut

Date:

January 25, 2022, 15:52:03

On 25.01.22 06:18, Amit Kapila wrote:
> I think to avoid this we can send a message to clear this (at least to
> clear XID in the view) after skipping the xact but there is no
> guarantee that it will be received by the stats collector.
> Additionally, the worker can periodically (say after every N (100,
> 500, etc) successful transaction) send a clear message after
> successful apply. This will ensure that eventually the error entry
> will be cleared.

Well, I think we need *some* solution for now.  We can't leave a footgun 
where you say, "skip transaction 700", somehow transaction 700 doesn't 
happen, the whole thing gets forgotten, but then 3 months later, the 
next transaction 700 mysteriously gets dropped.
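
The footgun can be shown with a few lines of standalone C. This is only a model: next_xid mimics how a 32-bit XID counter wraps past the reserved values, and should_skip is a hypothetical stand-in for a plain equality check against a stale skip setting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;              /* stand-in for PostgreSQL's type */
#define FirstNormalTransactionId ((TransactionId) 3)

/* Advance a 32-bit XID; on wraparound, jump over the reserved values 0..2. */
static TransactionId
next_xid(TransactionId xid)
{
    xid++;
    if (xid < FirstNormalTransactionId)
        xid = FirstNormalTransactionId;
    return xid;
}

/*
 * Hypothetical check: a bare equality test cannot tell "the transaction
 * 700 the user meant" from "a later transaction that reuses XID 700".
 */
static bool
should_skip(TransactionId stale_skip_xid, TransactionId remote_xid)
{
    return stale_skip_xid == remote_xid;
}
```

If a skip setting of 700 is never consumed, the counter eventually wraps and should_skip(700, 700) matches an unrelated transaction, which is the case described above.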

Re: Skipping logical replication transactions on subscriber side

From

"David G. Johnston"

Date:

January 25, 2022, 17:35:32

On Tue, Jan 25, 2022 at 5:52 AM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:

> On 25.01.22 06:18, Amit Kapila wrote:
> > I think to avoid this we can send a message to clear this (at least to
> > clear XID in the view) after skipping the xact but there is no
> > guarantee that it will be received by the stats collector.
> > Additionally, the worker can periodically (say after every N (100,
> > 500, etc) successful transaction) send a clear message after
> > successful apply. This will ensure that eventually the error entry
> > will be cleared.
>
> Well, I think we need *some* solution for now. We can't leave a footgun
> where you say, "skip transaction 700", somehow transaction 700 doesn't
> happen, the whole thing gets forgotten, but then 3 months later, the
> next transaction 700 mysteriously gets dropped.

This is indeed part of why I feel that the xid being skipped should be validated. As the feature is presented the user is supposed to read the xid from the system (the new stat view or the error log) and supply it and then the worker, when it goes to skip, should find that the very first transaction xid it encounters is the one it is being told to skip. It skips that transaction, clears the skipxid, and puts the system back into normal operating mode. If that first transaction xid isn't the one being specified to skip the worker should error with "skipping transaction failed, xid 123 expected but 456 found".

This whole lack of a guarantee of the availability and accuracy regarding the data that this process should be reliant upon needs to be engineered away.

David J.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 25, 2022, 17:47:04

On Tue, Jan 25, 2022 at 11:35 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Tue, Jan 25, 2022 at 5:52 AM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote:
>>
>> On 25.01.22 06:18, Amit Kapila wrote:
>> > I think to avoid this we can send a message to clear this (at least to
>> > clear XID in the view) after skipping the xact but there is no
>> > guarantee that it will be received by the stats collector.
>> > Additionally, the worker can periodically (say after every N (100,
>> > 500, etc) successful transaction) send a clear message after
>> > successful apply. This will ensure that eventually the error entry
>> > will be cleared.
>>
>> Well, I think we need *some* solution for now.  We can't leave a footgun
>> where you say, "skip transaction 700", somehow transaction 700 doesn't
>> happen, the whole thing gets forgotten, but then 3 months later, the
>> next transaction 700 mysteriously gets dropped.
>
>
> This is indeed part of why I feel that the xid being skipped should be validated.  As the feature is presented the user is supposed to read the xid from the system (the new stat view or the error log) and supply it and then the worker, when it goes to skip, should find that the very first transaction xid it encounters is the one it is being told to skip. It skips that transaction, clears the skipxid, and puts the system back into normal operating mode.  If that first transaction xid isn't the one being specified to skip the worker should error with "skipping transaction failed, xid 123 expected but 456 found".

Yeah, I think it's a good idea to clear the subskipxid after the first
transaction regardless of whether the worker skipped it.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

"David G. Johnston"

Date:

January 25, 2022, 17:58:00

On Tue, Jan 25, 2022 at 7:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

> Yeah, I think it's a good idea to clear the subskipxid after the first
> transaction regardless of whether the worker skipped it.

So basically instead of stopping the worker with an error you suggest having the worker continue applying changes (after resetting subskipxid, and - arguably - the ?_error_* fields). Log the transaction xid mis-match as a warning in the log file as opposed to an error.

I was supposing to make it an error and have the worker stop again since in a system where the xid is verified and the code is bug-free I would expect the situation to be a "can't happen" one and I'd rather error in that circumstance than warn. The DBA will have to go and ALTER SUBSCRIPTION SKIP (xid = NONE) to get the worker working again but I find that acceptable in this case.

David J.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 25, 2022, 18:08:34

On Tue, Jan 25, 2022 at 11:58 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Tue, Jan 25, 2022 at 7:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>> Yeah, I think it's a good idea to clear the subskipxid after the first
>> transaction regardless of whether the worker skipped it.
>>
>
> So basically instead of stopping the worker with an error you suggest having the worker continue applying changes (after resetting subskipxid, and - arguably - the ?_error_* fields).  Log the transaction xid mis-match as a warning in the log file as opposed to an error.

Agreed, I think it's better to log a warning than to raise an error.
In the case where the user specified the wrong XID, the worker should
fail again due to the same error.
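A rough Python sketch of this warn-and-continue behaviour, for comparison with the strict variant. Everything here is an assumption made for illustration: the function name, the `failing_xids` set that models which transactions cannot be applied, and the `log` list that stands in for the server log.

```python
class ApplyError(Exception):
    """Stands in for whatever error made the worker fail in the first place."""


def apply_with_lenient_skip(transactions, skip_xid, failing_xids, log):
    """Skip the first transaction if it matches skip_xid, otherwise warn and go on.

    The skip request is only consulted for the first transaction, which
    models clearing subskipxid after the first transaction regardless of
    whether it was skipped. Returns the list of applied XIDs.
    """
    applied = []
    for position, xid in enumerate(transactions):
        if position == 0 and skip_xid is not None:
            if xid == skip_xid:
                continue  # matched: skip this transaction
            log.append(
                f"WARNING: skip xid {skip_xid} did not match first transaction {xid}"
            )
        if xid in failing_xids:
            # A wrongly specified XID leaves the original problem in place,
            # so the worker fails again with the same error.
            raise ApplyError(f"could not apply transaction {xid}")
        applied.append(xid)
    return applied
```

With a correct XID the failing transaction is skipped and nothing is logged. With a wrong XID the log gains one warning and the original error is raised again.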

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

"David G. Johnston"

Date:

January 25, 2022, 18:14:00

On Tue, Jan 25, 2022 at 8:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

> On Tue, Jan 25, 2022 at 11:58 PM David G. Johnston
> <david.g.johnston@gmail.com> wrote:
> >
> > On Tue, Jan 25, 2022 at 7:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >>
> >> Yeah, I think it's a good idea to clear the subskipxid after the first
> >> transaction regardless of whether the worker skipped it.
> >>
> >
> > So basically instead of stopping the worker with an error you suggest having the worker continue applying changes (after resetting subskipxid, and - arguably - the ?_error_* fields). Log the transaction xid mis-match as a warning in the log file as opposed to an error.
>
> Agreed, I think it's better to log a warning than to raise an error.
> In the case where the user specified the wrong XID, the worker should
> fail again due to the same error.

If it remains possible for the system to accept a wrongly specified XID I would agree that this behavior is preferable. At least when the user wonders why the skip didn't work and they are seeing the same error again they will have a log entry warning telling them their XID choice was incorrect. I would prefer that the system not accept a wrongly specified XID and the user be told directly and sooner that their XID choice was incorrect.

David J.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 25, 2022, 18:32:43

On Wed, Jan 26, 2022 at 12:14 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
>
> On Tue, Jan 25, 2022 at 8:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>> On Tue, Jan 25, 2022 at 11:58 PM David G. Johnston
>> <david.g.johnston@gmail.com> wrote:
>> >
>> > On Tue, Jan 25, 2022 at 7:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>> >>
>> >> Yeah, I think it's a good idea to clear the subskipxid after the first
>> >> transaction regardless of whether the worker skipped it.
>> >>
>> >
>> > So basically instead of stopping the worker with an error you suggest having the worker continue applying changes (after resetting subskipxid, and - arguably - the ?_error_* fields).  Log the transaction xid mis-match as a warning in the log file as opposed to an error.
>>
>> Agreed, I think it's better to log a warning than to raise an error.
>> In the case where the user specified the wrong XID, the worker should
>> fail again due to the same error.
>>
>
> If it remains possible for the system to accept a wrongly specified XID I would agree that this behavior is preferable. At least when the user wonders why the skip didn't work and they are seeing the same error again they will have a log entry warning telling them their XID choice was incorrect.

Yes.

> I would prefer that the system not accept a wrongly specified XID and the user be told directly and sooner that their XID choice was incorrect.

Given that we cannot rely on the pg_stat_subscription_workers view
for this purpose, we would need either a new sub-system that tracks
each logical replication status so the system can set the error XID to
subskipxid, or to wait for shared-memory based stats collector. While
agreeing that ideally we need such a sub-system, I'm not sure that
everyone will agree to add complexity for this feature. That having
been said, if there is a significant need for it, we can implement it
as an improvement.



Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

"David G. Johnston"

Date:

January 26, 2022, 01:05:22

On Tue, Jan 25, 2022 at 8:33 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

> Given that we cannot rely on the pg_stat_subscription_workers view
> for this purpose, we would need either a new sub-system that tracks
> each logical replication status so the system can set the error XID to
> subskipxid, or to wait for shared-memory based stats collector.

I'm reading over the monitoring-stats page to try and get my head around all of this. First of all, it defines two kinds of views:

1. PostgreSQL's statistics collector is a subsystem that supports collection and reporting of information about server activity.

2. PostgreSQL also supports reporting dynamic information ... This facility is independent of the collector process.

It then has two tables:

28.1 Dynamic Statistics Views (describing #2 above)

28.2 Collected Statistics Views (describing #1 above)

Apparently the "collector process" is UDP-like, not reliable. The documentation fails to mention this fact. I'd argue that this is a documentation bug.

I do see that the pg_stat_subscription_workers view is correctly placed in Table 28.2

Reviewing the other views listed in that table only pg_stat_archiver abuses the statistics collector in a similar fashion. All of the others are actually metric oriented.

I don't care for the specification: "will contain one row per subscription worker on which errors have occurred, for workers applying logical replication changes and workers handling the initial data copy of the subscribed tables."

I would much rather have this behave similar to pg_stat_activity (which, of course, is a Dynamic Statistics View...) in that it shows only and all workers that are presently working. The tablesync workers should go away when they have finished synchronizing. I should not have to manually intervene to get rid of unreliable expired data. The log file feels like a superior solution to this monitoring view.

Alternatively, if the tablesync workers are done but we've been accumulating real statistics for them, then by all means keep them included in the view - but regardless of whether they encountered an error. But maybe the view can right join in pg_stat_subscription and show a column for "(pid is not null) AS is_active".

Maybe we need to add a track_finished_tablesync_workers GUC so the DBA can decide whether to devote storage and processing resources to that historical information.

If you had kept the original view name, "pg_stat_subscription_error", this whole issue goes away. But you decided to make it more generic and call it "pg_stat_subscription_workers" - which means you need to get rid of the error-specific condition in the WHERE clause for the view. Show all workers - I can filter on is_active. Showing only active workers is also acceptable. You won't get to change your mind so decide whether this wants to show only current and running state or whether historical statistics for now defunct tablesync workers are desired. Personally, I would just show active workers and if someone wants to add the feature they can add a track_tablesync_worker_stats GUC and a matching view.

From that, every apply worker should be sending a statistics message to the collector periodically. If error info is not present and the state is "all is well", clear out any existing error info from the view. The attempt to include an actual statistic field here doesn't seem useful nor redeeming. I would add a "state" field in its place (well, after subrelid). And I would still rename the columns to current_error_* and note that these should be null unless the status field shows error (there may be some additional complexity here). Just get rid of last_error_count.

David J.

P.S. I saw the discussion regarding pg_dump'ing the subskipxid field. I didn't notice any discussion around creating and restoring a basebackup. It seems like during server startup subskipxid should just be cleared out. Then it doesn't matter what one does during backup.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 26, 2022, 04:09:06

On Tue, Jan 25, 2022 at 6:18 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 25.01.22 03:54, Amit Kapila wrote:
> >> I don't think this functionality allows a nonprivileged user to do
> >> anything they couldn't otherwise do.  You can create inconsistent data
> >> in the sense that you can choose not to apply certain replicated data.
> >>
> > I thought this will be the only primary way to skip applying certain
> > transactions. The other could be via pg_replication_origin_advance().
> > Or are you talking about the case where we skip applying update/delete
> > where the corresponding rows are not found?
> >
> > I see the point that if we can allow the owner to skip applying
> > updates/deletes in certain cases then probably this should also be
> > okay. Kindly let us know if you have something else in mind as well?
>
> Let's start this again: The question at hand is whether ALTER
> SUBSCRIPTION ... SKIP should be allowed for subscription owners that are
> not superusers.  The argument raised against that was that this would
> allow the owner to create "inconsistent" data.  But it hasn't been
> explained what that actually means or why it is dangerous.
>

There are two reasons in my mind: (a) Because we skip the entire
transaction, we also skip unrelated data changes that are not the
direct cause of the conflict. That can unintentionally skip actual
changes (insert/update/delete/truncate) to some relations, after which
future changes may cause a conflict or may not get applied. A few
examples: after a TRUNCATE is skipped, the INSERTs in following
transactions can cause the error "duplicate key .."; similarly, if some
INSERT is skipped, then a following UPDATE/DELETE won't find the
corresponding row to perform the operation. (b) Users can specify some
random XID. The discussion below is trying to detect this and raise
WARNING/ERROR, but it could still cause some valid transaction (which
wouldn't generate any conflict/error) to be skipped.

These can lead to some missing data in the subscriber which the user
might not have expected.
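The two examples in (a) can be reproduced with a toy model. This is only a sketch in Python: the subscriber table is a plain dict keyed by primary key, changes are tuples, and `apply_transaction` and `DuplicateKey` are hypothetical names, not anything from PostgreSQL. It does not model rollback of a partially applied transaction.

```python
class DuplicateKey(Exception):
    """Stands in for a unique-constraint violation on the subscriber."""


def apply_transaction(table, changes):
    """Apply one transaction's changes to `table` (a dict keyed by primary key).

    Returns the number of UPDATE/DELETE changes that found no matching row,
    which models changes that silently do not get applied.
    """
    missing = 0
    for change in changes:
        op = change[0]
        if op == "TRUNCATE":
            table.clear()
        elif op == "INSERT":
            _, key, value = change
            if key in table:
                raise DuplicateKey(f"duplicate key value: {key}")
            table[key] = value
        elif op == "UPDATE":
            _, key, value = change
            if key in table:
                table[key] = value
            else:
                missing += 1
        elif op == "DELETE":
            _, key = change
            if key in table:
                del table[key]
            else:
                missing += 1
    return missing
```

If a transaction containing `("TRUNCATE",)` is skipped while the subscriber still holds key 1, a later `("INSERT", 1, ...)` raises `DuplicateKey`. If a transaction containing `("INSERT", 3, ...)` is skipped, a later `("UPDATE", 3, ...)` finds no row and is counted as missing.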

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

"David G. Johnston"

Date:

January 26, 2022, 05:01:24

On Mon, Jan 24, 2022 at 12:59 AM David G. Johnston <david.g.johnston@gmail.com> wrote:

> 5(out). wait for the user to manually restart the replication stream

Do you mean that there always is user intervention after error so the
replication stream can resume?

That is my working assumption. It doesn't seem like the system would auto-resume without a DBA doing something (I'll attribute a server crash to the DBA for convenience).

Apparently I need to read more about how the system works today to understand how this varies from and integrates with today's user experience.

I've done some code reading. My understanding is that a background worker for the main apply of a given subscription is created from the launcher code (not reviewed) which is initialized at server startup (or as needed sometime thereafter). This goes into a for(;;) loop in LogicalRepApplyLoop under a PG_TRY in ApplyWorkerMain. When a message is applied that provokes an error the PG_CATCH() in ApplyWorkerMain takes over and then this worker dies. While in that PG_CATCH() we have an aborted transaction and so are limited in what we can change. We PG_RE_THROW(); back to the background worker infrastructure and let it perform logging and cleanup; which includes destroying this instance of the background worker. The background worker that is destroyed is replaced and its replacement is identical to the original so far as the statistics collector is concerned.

I haven't traced out when the replacement apply worker gets recreated. It seems like doing so immediately, and then it going and just encountering the same error, would be an undesirable choice, and so I've assumed it does not. But I also wasn't expecting the apply worker to PG_RE_THROW() either, but instead continue on running in a different for(;;) loop waiting for some signal from the system that something has changed that may avoid the error that put it in timeout.

So my more detailed goal would be to get rid of PG_RE_THROW(); (I assume doing so would entail transaction rollback) and stay in the worker. Update pg_subscription with the error information (having removed PG_RE_THROW we have new things to consider re: pg_stat_subscription_workers). Go into a for(;;) loop, maybe polling pg_subscription for an indication that it is OK to retry applying the last transaction. (can an inter-process signal be sent from a normal backend process to a background worker process?). The SKIP command then matches XID values on pg_subscription; the resumption sees the subskipxid, updates pg_subscription to remove the error info and subskipxid, skips the next transaction assuming it has the matching XID, and then continues applying as normal. Adapt to deal with crash conditions as needed though clearing before reapplying seems like a safe default. Again, upon worker startup maybe they should be cleared too (making pg_dump and other backup considerations moot - as noted in my P.S. in the previous email).

I'm not sure we are paranoid enough regarding the locking of pg_subscription for purposes of reading and writing subskipxid. I'd probably rather serialize access to it, and maybe even not allow changing from one non-zero XID to another non-zero XID. It shouldn't be needed in practice (moreso if the XID has to be the one that is present from current_error_xid) and the user can always reset first.
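To illustrate the flow I have in mind, here is a small Python model. It is a sketch under stated assumptions, not a design: `SubscriptionCatalog`, `request_skip`, and `worker_step` are hypothetical names, the catalog is an in-memory object rather than pg_subscription, and locking, crashes, and signalling are not modelled.

```python
class SubscriptionCatalog:
    """Hypothetical stand-in for the pg_subscription fields under discussion."""

    def __init__(self):
        self.current_error_xid = None  # XID of the transaction that failed, if any
        self.subskipxid = None         # XID the user asked to skip, if any


def request_skip(catalog, xid):
    """Model of the SKIP command with the stricter checks suggested above."""
    if xid is None:
        catalog.subskipxid = None  # resetting is always allowed
        return
    if catalog.subskipxid is not None:
        # Disallow changing one non-zero XID directly to another.
        raise ValueError("subskipxid is already set; reset it first")
    if xid != catalog.current_error_xid:
        raise ValueError(
            f"xid {xid} does not match the failed transaction {catalog.current_error_xid}"
        )
    catalog.subskipxid = xid


def worker_step(catalog, xid, fails):
    """Handle one incoming transaction; returns 'skipped', 'waiting', or 'applied'."""
    if catalog.subskipxid is not None and catalog.subskipxid == xid:
        # Resumption: clear the error info and the skip request, then skip.
        catalog.subskipxid = None
        catalog.current_error_xid = None
        return "skipped"
    if fails:
        # Instead of re-throwing, record the error and wait for the DBA.
        catalog.current_error_xid = xid
        return "waiting"
    return "applied"
```

In this model a wrongly specified XID is rejected at the moment the user issues the command, rather than being discovered later by the worker.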

In worker.c I was and still am confused as to the meaning of 'c' and 'w' in LogicalRepApplyLoop. In apply_dispatch in that file enums are used to compare against the message byte, it would be helpful for the inexperienced reader if 'c' and 'w' were done as enums instead as well.

David J.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 26, 2022, 05:28:01

On Tue, Jan 25, 2022 at 8:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Jan 25, 2022 at 11:58 PM David G. Johnston
> <david.g.johnston@gmail.com> wrote:
> >
> > On Tue, Jan 25, 2022 at 7:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >>
> >> Yeah, I think it's a good idea to clear the subskipxid after the first
> >> transaction regardless of whether the worker skipped it.
> >>
> >
> > So basically instead of stopping the worker with an error you suggest having the worker continue applying changes (after resetting subskipxid, and - arguably - the ?_error_* fields).  Log the transaction xid mis-match as a warning in the log file as opposed to an error.
>
> Agreed, I think it's better to log a warning than to raise an error.
> In the case where the user specified the wrong XID, the worker should
> fail again due to the same error.
>

IIUC, the proposal is to compare the skip_xid with the very first
transaction the apply worker received to apply and raise a warning if
it doesn't match with skip_xid and then continue. This seems like a
reasonable idea but can we guarantee that it is always the first
transaction that we want to skip? We seem to guarantee that we won't
get something again once it is written durably/flushed on the
subscriber side. I guess here it can happen that before the errored
transaction, there is some empty xact, or maybe part of the stream
(consider streaming transactions) of some xact, or there could be
other cases as well where the server will send those xacts again.

Now, if the above reasoning is correct then I think your proposal to
clear the skip_xid in the catalog as soon as we have applied the first
transaction successfully seems reasonable to me.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 26, 2022, 05:38:13

On Wed, Jan 26, 2022 at 7:31 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Mon, Jan 24, 2022 at 12:59 AM David G. Johnston <david.g.johnston@gmail.com> wrote:
>>
>
> So my more detailed goal would be to get rid of PG_RE_THROW();
>

I don't think that will be possible, consider the FATAL/PANIC error
case. Also, there are reasons why we always restart apply worker on
ERROR even without this work. If we want to change that, we might need
to redesign the apply side mechanism which I don't think we should try
to do as part of this patch.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 26, 2022, 05:51:40

On Wed, Jan 26, 2022 at 11:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jan 25, 2022 at 8:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Jan 25, 2022 at 11:58 PM David G. Johnston
> > <david.g.johnston@gmail.com> wrote:
> > >
> > > On Tue, Jan 25, 2022 at 7:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >>
> > >> Yeah, I think it's a good idea to clear the subskipxid after the first
> > >> transaction regardless of whether the worker skipped it.
> > >>
> > >
> > > So basically instead of stopping the worker with an error you suggest having the worker continue applying changes (after resetting subskipxid, and - arguably - the ?_error_* fields).  Log the transaction xid mis-match as a warning in the log file as opposed to an error.
> >
> > Agreed, I think it's better to log a warning than to raise an error.
> > In the case where the user specified the wrong XID, the worker should
> > fail again due to the same error.
> >
>
> IIUC, the proposal is to compare the skip_xid with the very
> transaction the apply worker received to apply and raise a warning if
> it doesn't match with skip_xid and then continue. This seems like a
> reasonable idea but can we guarantee that it is always the first
> transaction that we want to skip? We seem to guarantee that we won't
> get something again once it is written durably/flushed on the
> subscriber side. I guess here it can happen that before the errored
> transaction, there is some empty xact, or maybe part of the stream
> (consider streaming transactions) of some xact, or there could be
> other cases as well where the server will send those xacts again.

Good point.

I guess that in the situation where the worker has entered an error
loop, we can guarantee that the worker fails while applying the first
non-empty transaction since starting logical replication. And the transaction is
what we’d like to skip. If the transaction that can be applied without
an error is resent after a restart, it’s a problem of logical
replication. As you pointed out, it's possible that there are some
empty transactions before the transaction in question since we don't
advance replication origin LSN if the transaction is empty. Also,
probably the same is true for a streamed transaction that is rolled
back or ROLLBACK-PREPARED transactions. So, we can also skip clearing
subskipxid if the transaction is empty? That is, we make sure to clear
it after applying the first non-empty transaction. We would need to
carefully think about this solution otherwise ALTER SUBSCRIPTION SKIP
ends up not working at all in some cases.
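A small Python model of the difference between the two clearing rules. This is only a sketch: `run_apply_worker` and the `clear_on_empty` flag are hypothetical, and each transaction is just a pair of a remote XID and a list of changes, where an empty list models an empty transaction that gets resent after a restart.

```python
def run_apply_worker(transactions, skip_xid, clear_on_empty=False):
    """Process (xid, changes) pairs and return (applied_xids, remaining_skip_xid).

    With clear_on_empty=False, the skip request survives empty transactions
    and is cleared only after the first non-empty one. With
    clear_on_empty=True, it is cleared after the first transaction of any
    kind, empty or not.
    """
    applied = []
    for xid, changes in transactions:
        if not changes:
            if clear_on_empty:
                skip_xid = None
            continue  # nothing to apply for an empty transaction
        if skip_xid is not None:
            matched = xid == skip_xid
            skip_xid = None  # cleared after the first non-empty transaction
            if matched:
                continue
        applied.append(xid)
    return applied, skip_xid
```

For the stream 588 (empty), 589 (empty), 590, 591 with a skip request for 590, the default rule skips 590 and applies only 591. With `clear_on_empty=True` the request is dropped at 588, so 590 is applied and the skip has no effect.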

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 26, 2022, 06:25:21

On Wed, Jan 26, 2022 at 11:51 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jan 26, 2022 at 11:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Jan 25, 2022 at 8:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Tue, Jan 25, 2022 at 11:58 PM David G. Johnston
> > > <david.g.johnston@gmail.com> wrote:
> > > >
> > > > On Tue, Jan 25, 2022 at 7:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >>
> > > >> Yeah, I think it's a good idea to clear the subskipxid after the first
> > > >> transaction regardless of whether the worker skipped it.
> > > >>
> > > >
> > > > So basically instead of stopping the worker with an error you suggest having the worker continue applying changes (after resetting subskipxid, and - arguably - the ?_error_* fields).  Log the transaction xid mis-match as a warning in the log file as opposed to an error.
> > >
> > > Agreed, I think it's better to log a warning than to raise an error.
> > > In the case where the user specified the wrong XID, the worker should
> > > fail again due to the same error.
> > >
> >
> > IIUC, the proposal is to compare the skip_xid with the very
> > transaction the apply worker received to apply and raise a warning if
> > it doesn't match with skip_xid and then continue. This seems like a
> > reasonable idea but can we guarantee that it is always the first
> > transaction that we want to skip? We seem to guarantee that we won't
> > get something again once it is written durably/flushed on the
> > subscriber side. I guess here it can happen that before the errored
> > transaction, there is some empty xact, or maybe part of the stream
> > (consider streaming transactions) of some xact, or there could be
> > other cases as well where the server will send those xacts again.
>
> Good point.
>
> I guess that in the situation the worker entered an error loop, we can
> guarantee that the worker fails while applying the first non-empty
> transaction since starting logical replication. And the transaction is
> what we’d like to skip. If the transaction that can be applied without
> an error is resent after a restart, it’s a problem of logical
> replication. As you pointed out, it's possible that there are some
> empty transactions before the transaction in question since we don't
> advance replication origin LSN if the transaction is empty. Also,
> probably the same is true for a streamed transaction that is rolled
> back or ROLLBACK-PREPARED transactions. So, we can also skip clearing
> subskipxid if the transaction is empty? That is, we make sure to clear
> it after applying the first non-empty transaction. We would need to
> carefully think about this solution otherwise ALTER SUBSCRIPTION SKIP
> ends up not working at all in some cases.

Probably, we also need to consider the case where the tablesync worker
entered an error loop and the user wants to skip the transaction? The
apply worker is also running at the same time but it should not clear
subskipxid. Similarly, the tablesync worker should not clear
subskipxid if the apply worker wants to skip the transaction.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 26, 2022, 06:54:42

On Wed, Jan 26, 2022 at 8:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jan 26, 2022 at 11:51 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Jan 26, 2022 at 11:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > IIUC, the proposal is to compare the skip_xid with the very
> > > transaction the apply worker received to apply and raise a warning if
> > > it doesn't match with skip_xid and then continue. This seems like a
> > > reasonable idea but can we guarantee that it is always the first
> > > transaction that we want to skip? We seem to guarantee that we won't
> > > get something again once it is written durably/flushed on the
> > > subscriber side. I guess here it can happen that before the errored
> > > transaction, there is some empty xact, or maybe part of the stream
> > > (consider streaming transactions) of some xact, or there could be
> > > other cases as well where the server will send those xacts again.
> >
> > Good point.
> >
> > I guess that in the situation the worker entered an error loop, we can
> > guarantee that the worker fails while applying the first non-empty
> > transaction since starting logical replication. And the transaction is
> > what we’d like to skip. If the transaction that can be applied without
> > an error is resent after a restart, it’s a problem of logical
> > replication. As you pointed out, it's possible that there are some
> > empty transactions before the transaction in question since we don't
> > advance replication origin LSN if the transaction is empty. Also,
> > probably the same is true for a streamed transaction that is rolled
> > back or ROLLBACK-PREPARED transactions. So, we can also skip clearing
> > subskipxid if the transaction is empty? That is, we make sure to clear
> > it after applying the first non-empty transaction. We would need to
> > carefully think about this solution otherwise ALTER SUBSCRIPTION SKIP
> > ends up not working at all in some cases.

I think it is okay to clear after the first successful application of
any transaction. What I was not sure about was the idea of giving
WARNING/ERROR if the first xact to be applied is not the same as
skip_xid.

>
> Probably, we also need to consider the case where the tablesync worker
> entered an error loop and the user wants to skip the transaction? The
> apply worker is also running at the same time but it should not clear
> subskipxid. Similarly, the tablesync worker should not clear
> subskipxid if the apply worker wants to skip the transaction.
>

I think for tablesync workers, the skip_xid set via this mechanism
won't work as we don't have any remote_xid for them, nor is any
XID reported in the view for them.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 26, 2022, 07:05:56

On Wed, Jan 26, 2022 at 12:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jan 26, 2022 at 8:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Jan 26, 2022 at 11:51 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Wed, Jan 26, 2022 at 11:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > IIUC, the proposal is to compare the skip_xid with the very
> > > > transaction the apply worker received to apply and raise a warning if
> > > > it doesn't match with skip_xid and then continue. This seems like a
> > > > reasonable idea but can we guarantee that it is always the first
> > > > transaction that we want to skip? We seem to guarantee that we won't
> > > > get something again once it is written durably/flushed on the
> > > > subscriber side. I guess here it can happen that before the errored
> > > > transaction, there is some empty xact, or maybe part of the stream
> > > > (consider streaming transactions) of some xact, or there could be
> > > > other cases as well where the server will send those xacts again.
> > >
> > > Good point.
> > >
> > > I guess that in the situation the worker entered an error loop, we can
> > > guarantee that the worker fails while applying the first non-empty
> > > transaction since starting logical replication. And the transaction is
> > > what we’d like to skip. If the transaction that can be applied without
> > > an error is resent after a restart, it’s a problem of logical
> > > replication. As you pointed out, it's possible that there are some
> > > empty transactions before the transaction in question since we don't
> > > advance replication origin LSN if the transaction is empty. Also,
> > > probably the same is true for a streamed transaction that is rolled
> > > back or ROLLBACK-PREPARED transactions. So, we can also skip clearing
> > > subskipxid if the transaction is empty? That is, we make sure to clear
> > > it after applying the first non-empty transaction. We would need to
> > > carefully think about this solution otherwise ALTER SUBSCRIPTION SKIP
> > > ends up not working at all in some cases.
>
> I think it is okay to clear after the first successful application of
> any transaction. What I was not sure was about the idea of giving
> WARNING/ERROR if the first xact to be applied is not the same as
> skip_xid.

Do you prefer not to do anything in this case?

>
> >
> > Probably, we also need to consider the case where the tablesync worker
> > entered an error loop and the user wants to skip the transaction? The
> > apply worker is also running at the same time but it should not clear
> > subskipxid. Similarly, the tablesync worker should not clear
> > subskipxid if the apply worker wants to skip the transaction.
> >
>
> I think for tablesync workers, the skip_xid set via this mechanism
> won't work as we don't have any remote_xid for them, and neither any
> XID is reported in the view for them.

If the tablesync worker raises an error while applying changes after
finishing the copy, it also reports the error XID.


Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 26, 2022, 07:10:00

On Wed, Jan 26, 2022 at 7:05 AM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Tue, Jan 25, 2022 at 8:33 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>> Given that we cannot rely on the pg_stat_subscription_workers view
>> for this purpose, we would need either a new sub-system that tracks
>> each logical replication status so the system can set the error XID to
>> subskipxid, or to wait for shared-memory based stats collector.
>
>
> I'm reading over the monitoring-stats page to try and get my head around
> all of this.  First of all, it defines two kinds of views:
>
> 1. PostgreSQL's statistics collector is a subsystem that supports
> collection and reporting of information about server activity.
> 2. PostgreSQL also supports reporting dynamic information ... This
> facility is independent of the collector process.
>
> It then has two tables:
>
> 28.1 Dynamic Statistics Views (describing #2 above)
> 28.2 Collected Statistics Views (describing #1 above)
>
> Apparently the "collector process" is UDP-like, not reliable.  The
> documentation fails to mention this fact.  I'd argue that this is a
> documentation bug.
>
> I do see that the pg_stat_subscription_workers view is correctly placed
> in Table 28.2.
>
> Reviewing the other views listed in that table, only pg_stat_archiver
> abuses the statistics collector in a similar fashion.  All of the others
> are actually metric oriented.
>
> I don't care for the specification: "will contain one row per
> subscription worker on which errors have occurred, for workers applying
> logical replication changes and workers handling the initial data copy of
> the subscribed tables."
>
> I would much rather have this behave similar to pg_stat_activity (which,
> of course, is a Dynamic Statistics View...) in that it shows only and all
> workers that are presently working.

I have no objection against having a dynamic statistics view showing
the status of each running worker but I think it should be implemented
in a separate view and not be something that replaces the
pg_stat_subscription_workers. I think pg_stat_subscription would be
the right place for it.

> The tablesync workers should go away when they have finished
> synchronizing.  I should not have to manually intervene to get rid of
> unreliable expired data.  The log file feels like a superior solution to
> this monitoring view.
>
> Alternatively, if the tablesync workers are done but we've been
> accumulating real statistics for them, then by all means keep them
> included in the view - but regardless of whether they encountered an
> error.  But maybe the view can right join in pg_stat_subscription and
> show a column for "(pid is not null) AS is_active".
>
> Maybe we need to add a track_finished_tablesync_workers GUC so the DBA
> can decide whether to devote storage and processing resources to that
> historical information.
>
> If you had kept the original view name, "pg_stat_subscription_error",
> this whole issue goes away.  But you decided to make it more generic and
> call it "pg_stat_subscription_workers" - which means you need to get rid
> of the error-specific condition in the WHERE clause for the view.  Show
> all workers - I can filter on is_active.  Showing only active workers is
> also acceptable.  You won't get to change your mind so decide whether
> this wants to show only current and running state or whether historical
> statistics for now defunct tablesync workers are desired.  Personally, I
> would just show active workers and if someone wants to add the feature
> they can add a track_tablesync_worker_stats GUC and a matching view.

We plan to clear/remove tablesync entries that have finished synchronization.

It’s better not to merge dynamic statistics such as pid and is_active
and accumulative statistics into one view. I think we can have both
views: pg_stat_subscription_workers view with some changes based on
the review comments (e.g., removing defunct tablesync entry), and
another view showing dynamic statistics such as the worker status.

> From that, every apply worker should be sending a statistics message to
> the collector periodically.  If error info is not present and the state
> is "all is well", clear out any existing error info from the view.  The
> attempt to include an actual statistic field here doesn't seem useful nor
> redeeming.  I would add a "state" field in its place (well, after
> subrelid).  And I would still rename the columns to current_error_* and
> note that these should be null unless the status field shows error (there
> may be some additional complexity here).  Just get rid of
> last_error_count.
>

I don't think that using the stats collector to show the current
status of each worker is a good idea because of 500ms lag, UDP
connection etc. Even if error info is not present and the state is
good according to the view, it might be out-of-date or simply not
true. If we want to do that, it’s much better to prepare something on
shmem so each worker can store its status (running or error, error
xid, etc.) and have pg_stat_subscription (or another view)  show the
information. One thing we need to consider is that the status has to
remain available even after the apply/tablesync worker exits, but we
don't know how many worker status entries we need to allocate in shmem
at startup time.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 26, 2022, 07:15:51

On Wed, Jan 26, 2022 at 9:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jan 26, 2022 at 12:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > I think it is okay to clear after the first successful application of
> > any transaction. What I was not sure was about the idea of giving
> > WARNING/ERROR if the first xact to be applied is not the same as
> > skip_xid.
>
> Do you prefer not to do anything in this case?
>

I am fine with clearing the skip_xid after the first successful
application. But note, we shouldn't do catalog access for this, we can
check if it is set in MySubscription.

> >
> > >
> > > Probably, we also need to consider the case where the tablesync worker
> > > entered an error loop and the user wants to skip the transaction? The
> > > apply worker is also running at the same time but it should not clear
> > > subskipxid. Similarly, the tablesync worker should not clear
> > > subskipxid if the apply worker wants to skip the transaction.
> > >
> >
> > I think for tablesync workers, the skip_xid set via this mechanism
> > won't work as we don't have any remote_xid for them, and neither any
> > XID is reported in the view for them.
>
> If the tablesync worker raises an error while applying changes after
> finishing the copy, it also reports the error XID.
>

Right and agreed with your assessment for the same.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

"David G. Johnston"

Date:

January 26, 2022, 07:43:37

On Tue, Jan 25, 2022 at 9:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Jan 26, 2022 at 9:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Wed, Jan 26, 2022 at 12:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

> > >
> > > Probably, we also need to consider the case where the tablesync worker
> > > entered an error loop and the user wants to skip the transaction? The
> > > apply worker is also running at the same time but it should not clear
> > > subskipxid. Similarly, the tablesync worker should not clear
> > > subskipxid if the apply worker wants to skip the transaction.
> > >
> >
> > I think for tablesync workers, the skip_xid set via this mechanism
> > won't work as we don't have any remote_xid for them, and neither any
> > XID is reported in the view for them.
>
> If the tablesync worker raises an error while applying changes after
> finishing the copy, it also reports the error XID.
>

Right and agreed with your assessment for the same.

IIUC each tablesync process also performs an apply stage but only applies the messages related to the single table it is responsible for. Once all tablesync workers synchronize they are all destroyed and the main apply worker takes over and applies transactions to all subscribed tables.

We probably should just provide an option for the user to specify "subrelid". If null, only the main apply worker will skip the given xid, otherwise only the worker tasked with syncing that particular table will do so. It might take a sequence of ALTER SUBSCRIPTION SET commands to get a broken initial table synchronization to load completely but at least there will not be any surprises as to which tables had transactions skipped and which did not.

It may even make sense, eventually for the main apply worker to skip on a subrelid basis. Since the main apply worker isn't applying transactions at the same time as the tablesync workers the non-null subrelid can also be interpreted by the main apply worker.

David J.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 26, 2022, 10:21:00

On Wed, Jan 26, 2022 at 1:43 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Tue, Jan 25, 2022 at 9:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Wed, Jan 26, 2022 at 9:36 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>> > On Wed, Jan 26, 2022 at 12:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> > > >
>> > > > Probably, we also need to consider the case where the tablesync worker
>> > > > entered an error loop and the user wants to skip the transaction? The
>> > > > apply worker is also running at the same time but it should not clear
>> > > > subskipxid. Similarly, the tablesync worker should not clear
>> > > > subskipxid if the apply worker wants to skip the transaction.
>> > > >
>> > >
>> > > I think for tablesync workers, the skip_xid set via this mechanism
>> > > won't work as we don't have any remote_xid for them, and neither any
>> > > XID is reported in the view for them.
>> >
>> > If the tablesync worker raises an error while applying changes after
>> > finishing the copy, it also reports the error XID.
>> >
>>
>> Right and agreed with your assessment for the same.
>>
>
> IIUC each tablesync process also performs an apply stage but only applies
> the messages related to the single table it is responsible for.  Once all
> tablesync workers synchronize they are all destroyed and the main apply
> worker takes over and applies transactions to all subscribed tables.
>
> We probably should just provide an option for the user to specify
> "subrelid".  If null, only the main apply worker will skip the given xid,
> otherwise only the worker tasked with syncing that particular table will
> do so.  It might take a sequence of ALTER SUBSCRIPTION SET commands to
> get a broken initial table synchronization to load completely but at
> least there will not be any surprises as to which tables had transactions
> skipped and which did not.

That would work, but I’m concerned about whether users can specify it
properly. Also, we would need to change the errcontext message
generated by apply_error_callback() so the user can tell whether the
error occurred in the apply worker or a tablesync worker.

Or, as another idea, since an error during table synchronization is
not common and can in practice be resolved by truncating the table and
restarting the synchronization, there might be no need to go this far
and we can support it only for apply worker errors.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

January 26, 2022, 14:02:41

On Wed, Jan 26, 2022 at 12:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jan 26, 2022 at 1:43 PM David G. Johnston
> <david.g.johnston@gmail.com> wrote:
> >
> > We probably should just provide an option for the user to specify
> > "subrelid".  If null, only the main apply worker will skip the given
> > xid, otherwise only the worker tasked with syncing that particular
> > table will do so.  It might take a sequence of ALTER SUBSCRIPTION SET
> > commands to get a broken initial table synchronization to load
> > completely but at least there will not be any surprises as to which
> > tables had transactions skipped and which did not.
>
> That would work, but I’m concerned about whether users can specify it
> properly. Also, we would need to change the errcontext message
> generated by apply_error_callback() so the user can tell whether the
> error occurred in the apply worker or a tablesync worker.
>
> Or, as another idea, since an error during table synchronization is
> not common and can in practice be resolved by truncating the table and
> restarting the synchronization, there might be no need to go this far
> and we can support it only for apply worker errors.
>

Yes, that is what I have also in mind. We can always extend this
feature for tablesync process because it can not only fail for the
specified skip_xid but also for many other reasons during the initial
copy.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

January 27, 2022, 08:51:38

On Wed, Jan 26, 2022 at 8:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jan 26, 2022 at 12:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Jan 26, 2022 at 1:43 PM David G. Johnston
> > <david.g.johnston@gmail.com> wrote:
> > >
> > > We probably should just provide an option for the user to specify
> > > "subrelid".  If null, only the main apply worker will skip the given
> > > xid, otherwise only the worker tasked with syncing that particular
> > > table will do so.  It might take a sequence of ALTER SUBSCRIPTION SET
> > > commands to get a broken initial table synchronization to load
> > > completely but at least there will not be any surprises as to which
> > > tables had transactions skipped and which did not.
> >
> > That would work, but I’m concerned about whether users can specify it
> > properly. Also, we would need to change the errcontext message
> > generated by apply_error_callback() so the user can tell whether the
> > error occurred in the apply worker or a tablesync worker.
> >
> > Or, as another idea, since an error during table synchronization is
> > not common and can in practice be resolved by truncating the table and
> > restarting the synchronization, there might be no need to go this far
> > and we can support it only for apply worker errors.
> >
>
> Yes, that is what I have also in mind. We can always extend this
> feature for tablesync process because it can not only fail for the
> specified skip_xid but also for many other reasons during the initial
> copy.

I'll update the patch accordingly to test and verify this approach.

In the meantime, I’d like to discuss the possible ideas of storing the
error XID somewhere the worker can see it even after a restart. It has
been proposed that the worker updates the catalog when an error
occurs, which was criticized as updating the catalog in such a
situation is not a good idea.

The next idea I considered was to store the error XID somewhere in
shmem (e.g., ReplicationState). But in principle it requires at least
as many entries as there are subscriptions, not
max_logical_replication_workers. Since we don’t know that number at
startup time, we need to use DSM or a cache with a fixed number of
entries. It seems overkill to me.

The third idea, which is slightly better than others, is to update the
catalog by the launcher process, not the worker process; when an error
occurs, the apply worker stores the error XID (and maybe its
subscription OID) into its LogicalRepWorker entry, and the launcher
updates the corresponding entry of pg_subscription catalog before
launching workers. After the worker restarts, it clears the error XID
on the catalog if it successfully applied the transaction with the
error XID. The user can enable the skipping transaction behavior by a
query say ALTER SUBSCRIPTION SKIP ENABLED. The user cannot enable the
skipping behavior if the error XID is not set. If the skipping
behavior is enabled and the error XID is a valid value, the worker
skips the transaction and then clears both the error XID and a flag of
skipping behavior on the catalog.

With this idea, we don’t need a complex mechanism to store the error
XID for each subscription and can ensure that only the transaction in
question is skipped. But my concern is that the launcher updates the
catalog. Since it doesn’t connect to any database, it probably cannot
open the catalog indexes (because that requires looking up pg_class).
Therefore, we have to use in-place updates here. Through quick tests,
I’ve confirmed that using heap_inplace_update() to update the error XID
on pg_subscription tuples seems to work, but I'm not sure that using an
in-place update here is a legitimate approach.

What do you think? Any ideas?

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Peter Eisentraut

Date:

January 27, 2022, 16:42:19

On 26.01.22 05:05, Masahiko Sawada wrote:
>> I think it is okay to clear after the first successful application of
>> any transaction. What I was not sure was about the idea of giving
>> WARNING/ERROR if the first xact to be applied is not the same as
>> skip_xid.
> Do you prefer not to do anything in this case?

I think a warning would be sensible.  If the user specifies to skip a 
certain transaction and then that doesn't happen, we should at least say 
something.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

February 11, 2022, 05:09:38

On Thu, Jan 27, 2022 at 10:42 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 26.01.22 05:05, Masahiko Sawada wrote:
> >> I think it is okay to clear after the first successful application of
> >> any transaction. What I was not sure was about the idea of giving
> >> WARNING/ERROR if the first xact to be applied is not the same as
> >> skip_xid.
> > Do you prefer not to do anything in this case?
>
> I think a warning would be sensible.  If the user specifies to skip a
> certain transaction and then that doesn't happen, we should at least say
> something.

While waiting for comments on the design discussion of
both pg_stat_subscription_workers and the ALTER SUBSCRIPTION SKIP feature,
I’ve incorporated some (minor) comments into the current design patch,
which include:

* Use LSN instead of XID.
* Raise a warning if the user specifies to skip a certain transaction
and then that doesn’t happen.
* Skip-LSN has an effect on the first non-empty transaction. That is,
it’s cleared after successfully committing a non-empty transaction,
preventing a wrongly specified LSN from remaining.
* Remove some unnecessary TAP tests to reduce the test time.

I think we all agree with the first point regardless of where we store
error information. And speaking of the current design, I think we all
agree on other points. Since the design discussion is ongoing, I’ll
incorporate other comments according to the result of the discussion.

The attached 0001 patch modifies the pg_stat_subscription_workers to
report LSN instead of XID, which is required by ALTER SUBSCRIPTION
SKIP patch, the 0002 patch.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

On Tue, Feb 15, 2022 at 7:35 PM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 14.02.22 10:16, Amit Kapila wrote:
> > I think exposing LSN is a better approach as it doesn't have the
> > dangers of wraparound. And, I think users can use it with the existing
> > function pg_replication_origin_advance() which will save us from
> > adding additional code for this feature. We can explain/expand in docs
> > how users can use the error information from view/error_logs and use
> > the existing function to skip conflicting transactions. We might want
> > to even expose error_origin to make it a bit easier for users but not
> > sure. I feel the need for the new syntax (and then added code
> > complexity due to that) isn't warranted if we expose error_LSN and let
> > users use it with the existing functions.
>
> Well, the whole point of this feature is to provide a higher-level
> interface instead of pg_replication_origin_advance().  Replication
> origins are currently not something the users have to deal with
> directly.  We already document that you can use
> pg_replication_origin_advance() to skip erroring transactions.  But that
> seems unsatisfactory.  It'd be like using pg_surgery to fix unique
> constraint violations.

+1

I’ve considered a plan for the skipping logical replication
transaction feature toward PG15. Several ideas and patches have been
proposed here and in other related threads[1][2] for the skipping
logical replication transaction feature, as follows:

A. Change pg_stat_subscription_workers (committed 7a8507329085)
B. Add origin name and commit-LSN to logical replication worker
errcontext (proposed[2])
C. Store error information (e.g., the error message and commit-LSN) to
the system catalog
D. Introduce ALTER SUBSCRIPTION SKIP
E. Record the skipped data somewhere: server logs or a table

Given the remaining time for PG15, it’s unlikely that all of
them can be completed by the feature freeze. The most realistic plan for PG15
in my mind is to complete B and D. With these two items, the LSN of
the error-ed transaction is shown in the server log, and we can ask
users to check server logs for the LSN and use it with ALTER
SUBSCRIPTION SKIP command. If the community agrees with B+D, we will
have a user-visible feature for PG15 which can be further
extended/improved in PG16 by adding C and E. I started a new thread[2]
for B yesterday. In this thread, I'd like to discuss D.

I've attached an updated patch for D and here is the summary:

* Introduce a new command ALTER SUBSCRIPTION ... SKIP (lsn =
'0/1234'). The user can get the commit-LSN of the transaction in
question from the server logs thanks to B[2].
* The user-specified LSN (say skip-LSN) is stored in the
pg_subscription catalog.
* The apply worker skips the whole transaction if the transaction's
commit-LSN exactly matches the skip-LSN.
* The skip-LSN has an effect on only the first non-empty transaction
since the worker started to apply changes. IOW it's cleared after
either skipping the whole transaction or successfully committing a
non-empty transaction, preventing the skip-LSN from remaining in the
catalog. Also, since the latter case means that the user set the wrong
skip-LSN, we clear it with a warning.
* ALTER SUBSCRIPTION SKIP doesn't support tablesync workers. But it
would not be a problem in practice since an error during table
synchronization is not common and could be resolved by truncating the
table and restarting the synchronization.

For the above reasons, ALTER SUBSCRIPTION SKIP command is safer than
the existing way of using pg_replication_origin_advance().

I've attached an updated patch along with two patches for cfbot tests
since the main patch (0003) depends on the other two patches. Both
0001 and 0002 patches are the same ones I attached on another
thread[2].

Regards,

[1] https://www.postgresql.org/message-id/20220125063131.4cmvsxbz2tdg6g65%40alap3.anarazel.de
[2] https://www.postgresql.org/message-id/CAD21AoBarBf2oTF71ig2g_o%3D3Z_Dt6_sOpMQma1kFgbnA5OZ_w%40mail.gmail.com

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

On Thu, Mar 10, 2022 at 9:02 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Mar 1, 2022 at 8:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached an updated patch along with two patches for cfbot tests
> > since the main patch (0003) depends on the other two patches. Both
> > 0001 and 0002 patches are the same ones I attached on another
> > thread[2].
> >
>
> Few comments on 0003:
> =====================
> 1.
> +     <row>
> +      <entry role="catalog_table_entry"><para role="column_definition">
> +       <structfield>subskiplsn</structfield> <type>pg_lsn</type>
> +      </para>
> +      <para>
> +       Commit LSN of the transaction whose changes are to be skipped,
> if a valid
> +       LSN; otherwise <literal>0/0</literal>.
> +      </para></entry>
> +     </row>
>
> Can't this be prepared LSN or rollback prepared LSN? Can we say
> Finish/End LSN and then add some details which all LSNs can be there?

Right, changed to finish LSN.

>
> 2. The conflict resolution explanation needs an update after the
> latest commits and we should probably change the commit LSN
> terminology as mentioned in the previous point.

Updated.

>
> 3. The text in alter_subscription.sgml looks a bit repetitive to me
> (similar to what we have in logical-replication.sgml related to
> conflicts). Here also we refer to only commit LSN which needs to be
> changed as mentioned in the previous two points.

Updated.

>
> 4.
> if (strcmp(lsn_str, "none") == 0)
> + {
> + /* Setting lsn = NONE is treated as resetting LSN */
> + lsn = InvalidXLogRecPtr;
> + }
> + else
> + {
> + /* Parse the argument as LSN */
> + lsn = DatumGetTransactionId(DirectFunctionCall1(pg_lsn_in,
> + CStringGetDatum(lsn_str)));
> +
> + if (XLogRecPtrIsInvalid(lsn))
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("invalid WAL location (LSN): %s", lsn_str)));
>
> Is there a reason that we don't want to allow setting 0
> (InvalidXLogRecPtr) for skip LSN?

0 is obviously an invalid value for skip LSN, so it should not be
allowed, similar to other options (like setting slot_name to ''). Also,
we use 0 (InvalidXLogRecPtr) internally to reset subskiplsn when
NONE is specified.
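
The option handling discussed in this point can be sketched standalone as follows. This is a hedged illustration: the patch itself relies on pg_lsn_in for parsing, whereas parse_lsn_text below is a hypothetical stand-in that accepts the usual "hi/lo" hexadecimal form, and SkipLsnResult and parse_skip_lsn are names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint64_t XLogRecPtr;                  /* stand-in for PostgreSQL's LSN type */
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

typedef enum SkipLsnResult
{
    SKIP_LSN_RESET,     /* lsn = NONE: clear the stored value */
    SKIP_LSN_SET,       /* a valid, non-zero LSN was given */
    SKIP_LSN_REJECTED   /* malformed text, or 0/0 */
} SkipLsnResult;

/* Parse "XXXXXXXX/XXXXXXXX" (1 to 8 hex digits on each side of the slash). */
static bool
parse_lsn_text(const char *str, XLogRecPtr *result)
{
    static const char hexdigits[] = "0123456789abcdefABCDEF";
    size_t      len1 = strspn(str, hexdigits);
    size_t      len2;

    if (len1 < 1 || len1 > 8 || str[len1] != '/')
        return false;
    len2 = strspn(str + len1 + 1, hexdigits);
    if (len2 < 1 || len2 > 8 || str[len1 + 1 + len2] != '\0')
        return false;

    *result = ((XLogRecPtr) strtoul(str, NULL, 16) << 32) |
        (XLogRecPtr) strtoul(str + len1 + 1, NULL, 16);
    return true;
}

/* "none" resets; an explicit 0/0 is rejected rather than treated as reset. */
static SkipLsnResult
parse_skip_lsn(const char *lsn_str, XLogRecPtr *lsn)
{
    assert(lsn_str != NULL && lsn != NULL);

    if (strcmp(lsn_str, "none") == 0)
    {
        *lsn = InvalidXLogRecPtr;
        return SKIP_LSN_RESET;
    }
    if (!parse_lsn_text(lsn_str, lsn) || *lsn == InvalidXLogRecPtr)
    {
        *lsn = InvalidXLogRecPtr;
        return SKIP_LSN_REJECTED;
    }
    return SKIP_LSN_SET;
}
```

Keeping NONE as the only way to reset means the internal zero value never has to double as user input.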

>
> 5.
> +# The subscriber will enter an infinite error loop, so we don't want
> +# to overflow the server log with error messages.
> +$node_subscriber->append_conf(
> + 'postgresql.conf',
> + qq[
> +wal_retrieve_retry_interval = 2s
> +]);
>
> Can we change this test to use disable_on_error feature? I am thinking
> if the disable_on_error feature got committed first, maybe we can have
> one test file for this and disable_on_error feature (something like
> conflicts.pl).

Good idea. Updated.

I've attached an updated version of the patch. This patch can be applied on
top of the latest disable_on_error patch[1].

Regards,

[1] https://www.postgresql.org/message-id/CAA4eK1Kes9TsMpGL6m%2BAJNHYCGRvx6piYQt5v6TEbH_t9jh8nA%40mail.gmail.com

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

v13-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transa.patch

RE: Skipping logical replication transactions on subscriber side

From

"osumi.takamichi@fujitsu.com"

Date:

March 11, 2022, 14:36:55

On Friday, March 11, 2022 5:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> I've attached an updated version patch. This patch can be applied on top of the
> latest disable_on_error patch[1].
Hi, thank you for the patch. I'll share my review comments on v13.

(a) src/backend/commands/subscriptioncmds.c

@@ -84,6 +86,8 @@ typedef struct SubOpts
        bool            streaming;
        bool            twophase;
        bool            disableonerr;
+       XLogRecPtr      lsn;                    /* InvalidXLogRecPtr for resetting purpose,
+                                                                * otherwise a valid LSN */

I think this explanation is slightly odd and can be improved.
Strictly speaking, I feel a *valid* LSN is what serves the purpose of
skipping a transaction from the functional perspective. Also, the wording
"resetting purpose" is unclear by itself. I suggest the change below.

From:
InvalidXLogRecPtr for resetting purpose, otherwise a valid LSN
To:
A valid LSN when we skip transaction, otherwise InvalidXLogRecPtr

(b) The code position of additional append in describeSubscriptions

+
+               /* Skip LSN is only supported in v15 and higher */
+               if (pset.sversion >= 150000)
+                       appendPQExpBuffer(&buf,
+                                                         ", subskiplsn AS \"%s\"\n",
+                                                         gettext_noop("Skip LSN"));

I suggest to combine this code after subdisableonerr.

(c) parse_subscription_options

+                               /* Parse the argument as LSN */
+                               lsn = DatumGetTransactionId(DirectFunctionCall1(pg_lsn_in,

Here, shouldn't we call DatumGetLSN, instead of DatumGetTransactionId ?
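
To illustrate why the accessor matters: TransactionId is a 32-bit type while an LSN is 64 bits wide, so reading the value through the XID type drops the upper half. The sketch below uses stand-in typedefs and hypothetical helper names (datum_as_lsn, datum_via_xid, make_lsn), and assumes a 64-bit, pass-by-value Datum; it is not PostgreSQL's actual macro code.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t Datum;          /* stand-in: assumes 64-bit pass-by-value Datum */
typedef uint64_t XLogRecPtr;     /* LSNs are 64 bits wide */
typedef uint32_t TransactionId;  /* XIDs are 32 bits wide */

/* Build an LSN from the two halves written as "hi/lo". */
static XLogRecPtr
make_lsn(uint32_t hi, uint32_t lo)
{
    return ((XLogRecPtr) hi << 32) | lo;
}

/* Read the value back at full width, as an LSN accessor would. */
static XLogRecPtr
datum_as_lsn(Datum d)
{
    return (XLogRecPtr) d;
}

/* Read the value through the 32-bit XID type: the "hi" half is lost. */
static XLogRecPtr
datum_via_xid(Datum d)
{
    assert(sizeof(TransactionId) < sizeof(XLogRecPtr));
    return (XLogRecPtr) (TransactionId) d;
}
```

Under these assumptions an input such as '1/0' would come back as 0 through the 32-bit path, which the following validity check would then report as an invalid LSN.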

(d) parse_subscription_options

+                       if (strcmp(lsn_str, "none") == 0)
+                       {
+                               /* Setting lsn = NONE is treated as resetting LSN */
+                               lsn = InvalidXLogRecPtr;
+                       }
+

We should remove this pair of curly brackets, since it wraps a single statement.

(e) src/backend/replication/logical/worker.c

+ * to skip applying the changes when starting to apply changes.  The subskiplsn is
+ * cleared after successfully skipping the transaction or applying non-empty
+ * transaction, where the later avoids the mistakenly specified subskiplsn from
+ * being left.

typo "the later" -> "the latter"

At the same time, I feel the last part of this sentence can be an independent sentence.
From:
, where the later avoids the mistakenly specified subskiplsn from being left
To:
. The latter prevents the mistakenly specified subskiplsn from being left

* Note that my comments below apply only if we choose not to merge the disable_on_error test with the skip LSN tests.

(f) src/test/subscription/t/030_skip_xact.pl

+use Test::More tests => 4;

It's better to utilize the new style for the TAP test.
Then, probably we should introduce done_testing()
at the end of the test.

(g) src/test/subscription/t/030_skip_xact.pl

I think there's no need to create two types of subscriptions.
Just one subscription with two_phase = on and streaming = on
would be sufficient for the tests (normal commit, commit prepared,
stream commit cases). I think this will reduce
the number of tables and publications, which will
make the whole test simpler.

Best Regards,
    Takamichi Osumi

RE: Skipping logical replication transactions on subscriber side

From

"shiy.fnst@fujitsu.com"

Date:

March 14, 2022, 12:50:41

On Fri, Mar 11, 2022 4:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> 
> I've attached an updated version patch. This patch can be applied on
> top of the latest disable_on_error patch[1].
> 

Thanks for your patch. Here are some comments for the v13 patch.

1. doc/src/sgml/ref/alter_subscription.sgml
+          Specifies the transaction's finish LSN of the remote transaction whose changes

Could it be simplified to "Specifies the finish LSN of the remote transaction
whose ...".

2.
I hit a failed assertion; the backtrace is attached. This is caused by the
following code in maybe_start_skipping_changes().

+        /*
+         * It's a rare case; a past subskiplsn was left because the server
+         * crashed after preparing the transaction and before clearing the
+         * subskiplsn. We clear it without a warning message so as not confuse
+         * the user.
+         */
+        if (unlikely(MySubscription->skiplsn < lsn))
+        {
+            clear_subscription_skip_lsn(MySubscription->skiplsn, InvalidXLogRecPtr, 0,
+                                        false);
+            Assert(!IsTransactionState());
+        }

We want to clear subskiplsn in the case mentioned in the comment. But if the next
transaction is a streaming transaction and this function is called by
apply_spooled_messages(), we are inside a transaction here. So, I think this
assertion is not suitable for streaming transactions. Thoughts?

3.
+    XLogRecPtr    subskiplsn;        /* All changes which committed at this LSN are
+                                 * skipped */

To be consistent, should the comment be changed to "All changes which finished
at this LSN are skipped"?

4.
+      After logical replication worker successfully skips the transaction or commits
+      non-empty transaction, the LSN (stored in
+      <structname>pg_subscription</structname>.<structfield>subskiplsn</structfield>)
+      is cleared.

Besides "commits non-empty transaction", subskiplsn would also be cleared in
some two-phase commit cases I think. Like prepare/commit/rollback a transaction,
even if it is an empty transaction. So, should we change it for these cases?

5.
+ * Clear subskiplsn of pg_subscription catalog with origin state update.

Should "with origin state update" be modified to "with origin state updated"?

Regards,
Shi yu

Attachments

backtrace.txt

RE: Skipping logical replication transactions on subscriber side

From

"osumi.takamichi@fujitsu.com"

Date:

March 14, 2022, 15:39:49

On Friday, March 11, 2022 5:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> I've attached an updated version patch. This patch can be applied on top of the
> latest disable_on_error patch[1].
Hi, a few extra comments on v13.

(1) src/backend/replication/logical/worker.c

With regard to clear_subscription_skip_lsn,
there are cases where we update the origin state twice.

For instance, consider the case where we reset subskiplsn by executing an
irrelevant non-empty transaction. The first update is
done in apply_handle_commit_internal and the second one
in clear_subscription_skip_lsn. In the second update,
we set replorigin_session_origin_lsn to a smaller value (commit_lsn)
than in the first update (end_lsn). Is that intentional and OK?

(2) src/backend/replication/logical/worker.c

+ * Both origin_lsn and origin_timestamp are the remote transaction's end_lsn
+ * and commit timestamp, respectively.
+ */
+static void
+stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_ts)

Typo. Should change 'origin_timestamp' to 'origin_ts',
because the name of the argument is the latter.

Also, here we handle not only commit but also prepare.
You need to fix the comment "commit timestamp" as well.

(3) src/backend/replication/logical/worker.c

+/*
+ * Clear subskiplsn of pg_subscription catalog with origin state update.
+ *
+ * if with_warning is true, we raise a warning when clearing the subskipxid.

It's better to insert this second sentence as the last sentence of
the other comments. It should start with capital letter as well.

Best Regards,
    Takamichi Osumi

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

March 15, 2022, 05:51:50

On Mon, Mar 14, 2022 at 6:50 PM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
>
> On Fri, Mar 11, 2022 4:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached an updated version patch. This patch can be applied on
> > top of the latest disable_on_error patch[1].
> >
>
> Thanks for your patch. Here are some comments for the v13 patch.

Thank you for the comments!

>
> 1. doc/src/sgml/ref/alter_subscription.sgml
> +          Specifies the transaction's finish LSN of the remote transaction whose changes
>
> Could it be simplified to "Specifies the finish LSN of the remote transaction
> whose ...".

Fixed.

>
> 2.
> I hit a failed assertion; the backtrace is attached. This is caused by the
> following code in maybe_start_skipping_changes().
>
> +               /*
> +                * It's a rare case; a past subskiplsn was left because the server
> +                * crashed after preparing the transaction and before clearing the
> +                * subskiplsn. We clear it without a warning message so as not confuse
> +                * the user.
> +                */
> +               if (unlikely(MySubscription->skiplsn < lsn))
> +               {
> +                       clear_subscription_skip_lsn(MySubscription->skiplsn, InvalidXLogRecPtr, 0,
> +                                                                               false);
> +                       Assert(!IsTransactionState());
> +               }
>
> We want to clear subskiplsn in the case mentioned in the comment. But if the next
> transaction is a streaming transaction and this function is called by
> apply_spooled_messages(), we are inside a transaction here. So, I think this
> assertion is not suitable for streaming transactions. Thoughts?

Good catch. After more thought, I realized that the assumption behind this
if statement is wrong and that we don't necessarily need to clear it here, since
the leftover skip-LSN will eventually be cleared when the next transaction
finishes. So I removed this part.

>
> 3.
> +       XLogRecPtr      subskiplsn;             /* All changes which committed at this LSN are
> +                                                                * skipped */
>
> To be consistent, should the comment be changed to "All changes which finished
> at this LSN are skipped"?

Fixed.

>
> 4.
> +      After logical replication worker successfully skips the transaction or commits
> +      non-empty transaction, the LSN (stored in
> +      <structname>pg_subscription</structname>.<structfield>subskiplsn</structfield>)
> +      is cleared.
>
> Besides "commits non-empty transaction", subskiplsn would also be cleared in
> some two-phase commit cases I think. Like prepare/commit/rollback a transaction,
> even if it is an empty transaction. So, should we change it for these cases?

Fixed.

>
> 5.
> + * Clear subskiplsn of pg_subscription catalog with origin state update.
>
> Should "with origin state update" be modified to "with origin state updated"?

Fixed.

I'll submit an updated patch soon.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

March 15, 2022, 09:13:17

Hi,

On Fri, Mar 11, 2022 at 8:37 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Friday, March 11, 2022 5:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > I've attached an updated version patch. This patch can be applied on top of the
> > latest disable_on_error patch[1].
> Hi, thank you for the patch. I'll share my review comments on v13.
>
>
> (a) src/backend/commands/subscriptioncmds.c
>
> @@ -84,6 +86,8 @@ typedef struct SubOpts
>         bool            streaming;
>         bool            twophase;
>         bool            disableonerr;
> +       XLogRecPtr      lsn;                    /* InvalidXLogRecPtr for resetting purpose,
> +                                                                * otherwise a valid LSN */
>
>
> I think this explanation is slightly odd and can be improved.
> Strictly speaking, I feel a *valid* LSN is for retting transaction purpose
> from the functional perspective. Also, the wording "resetting purpose"
> is unclear by itself. I'll suggest below change.
>
> From:
> InvalidXLogRecPtr for resetting purpose, otherwise a valid LSN
> To:
> A valid LSN when we skip transaction, otherwise InvalidXLogRecPtr

"when we skip transaction" sounds incorrect to me since it's just an
option value and does not indicate that we really skip the transaction
that has that LSN. I realized that we directly use InvalidXLogRecPtr
for subskiplsn, so I think there is no need to mention it.

>
> (b) The code position of additional append in describeSubscriptions
>
>
> +
> +               /* Skip LSN is only supported in v15 and higher */
> +               if (pset.sversion >= 150000)
> +                       appendPQExpBuffer(&buf,
> +                                                         ", subskiplsn AS \"%s\"\n",
> +                                                         gettext_noop("Skip LSN"));
>
> I suggest to combine this code after subdisableonerr.

I got the comment[1] from Peter to put it at the end, which looks better to me.

>
> (c) parse_subscription_options
>
>
> +                               /* Parse the argument as LSN */
> +                               lsn = DatumGetTransactionId(DirectFunctionCall1(pg_lsn_in,
>
>
> Here, shouldn't we call DatumGetLSN, instead of DatumGetTransactionId ?

Right, fixed.
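
The choice of conversion matters because an LSN is 64 bits wide while a transaction ID is only 32 bits, so a transaction-ID conversion can silently drop the upper half of the LSN. A minimal standalone sketch of that truncation follows. The typedefs are simplified stand-ins, and make_lsn, datum_get_transaction_id, and datum_get_lsn are hypothetical helper names used only for illustration, not PostgreSQL's actual Datum macros:

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for PostgreSQL's types (assumptions for this sketch). */
typedef uint64_t XLogRecPtr;    /* WAL position, 64 bits */
typedef uint32_t TransactionId; /* transaction ID, 32 bits */
typedef uint64_t Datum;         /* assumed 64-bit Datum */

static_assert(sizeof(TransactionId) < sizeof(XLogRecPtr),
              "a transaction ID cannot hold a full LSN");

/* Hypothetical helper: build an LSN from the two halves shown as "hi/lo". */
XLogRecPtr
make_lsn(uint32_t hi, uint32_t lo)
{
    return ((XLogRecPtr) hi << 32) | lo;
}

/* Mimics a transaction-ID conversion: keeps only the low 32 bits. */
TransactionId
datum_get_transaction_id(Datum d)
{
    return (TransactionId) d;
}

/* Mimics an LSN conversion: keeps all 64 bits. */
XLogRecPtr
datum_get_lsn(Datum d)
{
    return (XLogRecPtr) d;
}
```

With an LSN whose upper half is non-zero, such as 1/14C0378, the transaction-ID path returns only 0x14C0378, whereas the LSN path preserves the full value.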

>
>
> (d) parse_subscription_options
>
> +                       if (strcmp(lsn_str, "none") == 0)
> +                       {
> +                               /* Setting lsn = NONE is treated as resetting LSN */
> +                               lsn = InvalidXLogRecPtr;
> +                       }
> +
>
> We should remove this pair of curly brackets that is for one sentence.

I moved the comment on top of the if statement and removed the brackets.
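
As a rough illustration of the option handling discussed in (c) and (d), the sketch below treats "none" as a request to reset the LSN and otherwise parses the usual "hi/lo" hexadecimal form. It is a simplified stand-in: parse_skip_lsn is a hypothetical function name, and sscanf is more permissive than a real LSN input routine would be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint64_t XLogRecPtr;                 /* simplified stand-in */
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/*
 * Hypothetical helper: parse the value given for the skip LSN option.
 * Returns true on success and stores the result in *lsn.
 */
bool
parse_skip_lsn(const char *lsn_str, XLogRecPtr *lsn)
{
    unsigned int hi;
    unsigned int lo;
    char         trailing;

    assert(lsn_str != NULL && lsn != NULL);

    /* Setting lsn = NONE is treated as resetting the LSN */
    if (strcmp(lsn_str, "none") == 0)
    {
        *lsn = InvalidXLogRecPtr;
        return true;
    }

    /* Expect exactly two hex numbers separated by '/', with nothing after */
    if (sscanf(lsn_str, "%X/%X%c", &hi, &lo, &trailing) != 2)
        return false;

    *lsn = ((XLogRecPtr) hi << 32) | lo;
    return true;
}
```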

>
>
> (e) src/backend/replication/logical/worker.c
>
> + * to skip applying the changes when starting to apply changes.  The subskiplsn is
> + * cleared after successfully skipping the transaction or applying non-empty
> + * transaction, where the later avoids the mistakenly specified subskiplsn from
> + * being left.
>
> typo "the later" -> "the latter"
>
> At the same time, I feel the last part of this sentence can be an independent sentence.
> From:
> , where the later avoids the mistakenly specified subskiplsn from being left
> To:
> . The latter prevents the mistakenly specified subskiplsn from being left

Fixed.

>
>
> * Note that my comments below are applied if we choose we don't merge disable_on_error test with skip lsn tests.
>
> (f) src/test/subscription/t/030_skip_xact.pl
>
> +use Test::More tests => 4;
>
> It's better to utilize the new style for the TAP test.
> Then, probably we should introduce done_testing()
> at the end of the test.

Fixed.

>
> (g) src/test/subscription/t/030_skip_xact.pl
>
> I think there's no need to create two types of subscriptions.
> Just one subscription with two_phase = on and streaming = on
> would be sufficient for the tests (normal commit, commit prepared,
> and stream commit cases). I think this approach would reduce
> the number of tables and publications, which would
> make the whole test simpler.

Good point, fixed.

On Mon, Mar 14, 2022 at 9:39 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Friday, March 11, 2022 5:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > I've attached an updated version patch. This patch can be applied on top of the
> > latest disable_on_error patch[1].
> Hi, a few extra comments on v13.
>
>
> (1) src/backend/replication/logical/worker.c
>
>
> With regard to clear_subscription_skip_lsn,
> there are cases where we update the origin state twice.
>
> For instance, consider the case where we reset subskiplsn by executing an
> irrelevant non-empty transaction. The first update is
> done in apply_handle_commit_internal and the second one
> in clear_subscription_skip_lsn. In the second update,
> we set replorigin_session_origin_lsn to a smaller value (commit_lsn)
> than in the first update (end_lsn). Is that intentional and OK?

Good catch, this part is removed in the latest patch.

>
>
> (2) src/backend/replication/logical/worker.c
>
> + * Both origin_lsn and origin_timestamp are the remote transaction's end_lsn
> + * and commit timestamp, respectively.
> + */
> +static void
> +stop_skipping_changes(XLogRecPtr origin_lsn, TimestampTz origin_ts)
>
> Typo. Should change 'origin_timestamp' to 'origin_ts',
> because the name of the argument is the latter.
>
> Also, here we handle not only commit but also prepare.
> You need to fix the comment "commit timestamp" as well.

Fixed.

>
> (3) src/backend/replication/logical/worker.c
>
> +/*
> + * Clear subskiplsn of pg_subscription catalog with origin state update.
> + *
> + * if with_warning is true, we raise a warning when clearing the subskipxid.
>
> It's better to insert this second sentence as the last sentence of
> the other comments.

with_warning is removed in the latest patch.

I've attached an updated version patch.

Regards,

[1] https://www.postgresql.org/message-id/09b80566-c790-704b-35b4-33f87befc41f%40enterprisedb.com

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

v14-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transa.patch

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

March 15, 2022, 13:18:08

On Tue, Mar 15, 2022 at 11:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached an updated version patch.
>

Review:
=======
1.
+++ b/doc/src/sgml/logical-replication.sgml
@@ -366,15 +366,19 @@ CONTEXT:  processing remote data for replication
origin "pg_16395" during "INSER
    transaction, the subscription needs to be disabled temporarily by
    <command>ALTER SUBSCRIPTION ... DISABLE</command> first or
alternatively, the
    subscription can be used with the
<literal>disable_on_error</literal> option.
-   Then, the transaction can be skipped by calling the
+   Then, the transaction can be skipped by using
+   <command>ALTER SUBSCRITPION ... SKIP</command> with the finish LSN
+   (i.e., LSN 0/14C0378). After that the replication
+   can be resumed by <command>ALTER SUBSCRIPTION ... ENABLE</command>.
+   Alternatively, the transaction can also be skipped by calling the

Do we really need to disable the subscription for the skip feature? I
think that is required for origin_advance. Also, probably, we can say
Finish LSN could be Prepare LSN, Commit LSN, etc.

2.
+ /*
+ * Quick return if it's not requested to skip this transaction. This
+ * function is called every start of applying changes and we assume that
+ * skipping the transaction is not used in many cases.
+ */
+ if (likely(XLogRecPtrIsInvalid(MySubscription->skiplsn) ||

The second part of this comment (especially ".. every start of
applying changes ..") sounds slightly odd to me. How about changing it
to: "This function is called for every remote transaction and we
assume that skipping the transaction is not used in many cases."

3.
+
+ ereport(LOG,
+ errmsg("start skipping logical replication transaction which
finished at %X/%X",
...
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction which
finished at %X/%X",

No need of 'which' in above LOG messages. I think the message will be
clear without the use of which in above message.

4.
+ ereport(LOG,
+ (errmsg("done skipping logical replication transaction which
finished at %X/%X",
+ LSN_FORMAT_ARGS(skip_xact_finish_lsn))));
+
+ /* Stop skipping changes */
+ skip_xact_finish_lsn = InvalidXLogRecPtr;

Let's reverse the order of these statements to make them consistent
with the corresponding maybe_start_* function.

5.
+
+ if (myskiplsn != finish_lsn)
+ ereport(WARNING,
+ errmsg("skip-LSN of logical replication subscription \"%s\"
cleared", MySubscription->name),

Shouldn't this be a LOG instead of a WARNING as this will be displayed
only in server logs and by background apply worker?

6.
@@ -1583,7 +1649,8 @@ apply_handle_insert(StringInfo s)
  TupleTableSlot *remoteslot;
  MemoryContext oldctx;

- if (handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
+ if (is_skipping_changes() ||

Is there a reason to keep the skip_changes check here and in other DML
operations instead of at one central place in apply_dispatch?

7.
+ /*
+ * Start a new transaction to clear the subskipxid, if not started
+ * yet. The transaction is committed below.
+ */
+ if (!IsTransactionState())

I think the second part of the comment: "The transaction is committed
below." is not required.

8.
+ XLogRecPtr subskiplsn; /* All changes which finished at this LSN are
+ * skipped */
+
 #ifdef CATALOG_VARLEN /* variable-length fields start here */
  /* Connection string to the publisher */
  text subconninfo BKI_FORCE_NOT_NULL;
@@ -109,6 +112,8 @@ typedef struct Subscription
  bool disableonerr; /* Indicates if the subscription should be
  * automatically disabled if a worker error
  * occurs */
+ XLogRecPtr skiplsn; /* All changes which finished at this LSN are
+ * skipped */

No need for 'which' in the above comments.

9.
Can we merge 029_disable_on_error in 030_skip_xact and name it as
029_on_error (or 029_on_error_skip_disable or some variant of it)?
Both seem to be related features. I am slightly worried at the pace at
which the number of test files are growing in subscription test.

-- 
With Regards,
Amit Kapila.

RE: Skipping logical replication transactions on subscriber side

From

"osumi.takamichi@fujitsu.com"

Date:

March 15, 2022, 17:00:52

On Tuesday, March 15, 2022 3:13 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> I've attached an updated version patch.

A couple of minor comments on v14.

(1) apply_handle_commit_internal


+       if (is_skipping_changes())
+       {
+               stop_skipping_changes();
+
+               /*
+                * Start a new transaction to clear the subskipxid, if not started
+                * yet. The transaction is committed below.
+                */
+               if (!IsTransactionState())
+                       StartTransactionCommand();
+       }
+

I suppose we can move this condition check and stop_skipping_changes() call
to the inside of the block we enter when IsTransactionState() returns true.

As the comment of apply_handle_commit_internal() mentions,
it's the helper function for apply_handle_commit() and
apply_handle_stream_commit().

Then, I couldn't think that both callers don't open
a transaction before the call of apply_handle_commit_internal().
For applying spooled messages, we call begin_replication_step as well.

I can miss something, but timing when we receive COMMIT message
without opening a transaction, would be the case of empty transactions
where the subscription (and its subscription worker) is not interested.
If this is true, currently the patch's code includes
such cases within the range of is_skipping_changes() check.

(2) clear_subscription_skip_lsn's comments.

The comments for this function shouldn't touch
update of origin states, now that we don't update those.

+/*
+ * Clear subskiplsn of pg_subscription catalog with origin state updated.
+ *


This applies to other comments.

+       /*
+        * Update the subskiplsn of the tuple to InvalidXLogRecPtr.  If user has
+        * already changed subskiplsn before clearing it we don't update the
+        * catalog and don't advance the replication origin state.  
...
+        *            ....                We can reduce the possibility by
+        * logging a replication origin WAL record to advance the origin LSN
+        * instead but there is no way to advance the origin timestamp and it
+        * doesn't seem to be worth doing anything about it since it's a very rare
+        * case.
+        */



Best Regards,
    Takamichi Osumi

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

March 16, 2022, 03:32:24

On Tue, Mar 15, 2022 at 7:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Mar 15, 2022 at 11:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
>
> 6.
> @@ -1583,7 +1649,8 @@ apply_handle_insert(StringInfo s)
>   TupleTableSlot *remoteslot;
>   MemoryContext oldctx;
>
> - if (handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
> + if (is_skipping_changes() ||
>
> Is there a reason to keep the skip_changes check here and in other DML
> operations instead of at one central place in apply_dispatch?

Since we already have the check of applying the change on the spot at
the beginning of the handlers I feel it's better to add
is_skipping_changes() to the check than add a new if statement to
apply_dispatch, but do you prefer to check it in one central place in
apply_dispatch?

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

March 16, 2022, 05:28:02

On Wed, Mar 16, 2022 at 6:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Mar 15, 2022 at 7:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Mar 15, 2022 at 11:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> >
> > 6.
> > @@ -1583,7 +1649,8 @@ apply_handle_insert(StringInfo s)
> >   TupleTableSlot *remoteslot;
> >   MemoryContext oldctx;
> >
> > - if (handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
> > + if (is_skipping_changes() ||
> >
> > Is there a reason to keep the skip_changes check here and in other DML
> > operations instead of at one central place in apply_dispatch?
>
> Since we already have the check of applying the change on the spot at
> the beginning of the handlers I feel it's better to add
> is_skipping_changes() to the check than add a new if statement to
> apply_dispatch, but do you prefer to check it in one central place in
> apply_dispatch?
>

I think either way is fine. I just wanted to know the reason, your
current change looks okay to me.

Some questions/comments
======================
1. IIRC, we earlier thought of allowing the use of this option (SKIP)
only for superusers (as this can lead to inconsistent data if not used
carefully), but I don't see that check in the latest patch. What is the
reason for that?

2.
+ /*
+ * Update the subskiplsn of the tuple to InvalidXLogRecPtr.

I think we can change the above part of the comment to "Clear subskiplsn."

3.
+ * Since we already have

Isn't it better to say here: Since we have already ...?

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

March 16, 2022, 05:32:31

On Tue, Mar 15, 2022 at 7:30 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Tuesday, March 15, 2022 3:13 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > I've attached an updated version patch.
>
> A couple of minor comments on v14.
>
> (1) apply_handle_commit_internal
>
>
> +       if (is_skipping_changes())
> +       {
> +               stop_skipping_changes();
> +
> +               /*
> +                * Start a new transaction to clear the subskipxid, if not started
> +                * yet. The transaction is committed below.
> +                */
> +               if (!IsTransactionState())
> +                       StartTransactionCommand();
> +       }
> +
>
> I suppose we can move this condition check and stop_skipping_changes() call
> to the inside of the block we enter when IsTransactionState() returns true.
>
> As the comment of apply_handle_commit_internal() mentions,
> it's the helper function for apply_handle_commit() and
> apply_handle_stream_commit().
>
> Then, I couldn't think that both callers don't open
> a transaction before the call of apply_handle_commit_internal().
> For applying spooled messages, we call begin_replication_step as well.
>
> I can miss something, but timing when we receive COMMIT message
> without opening a transaction, would be the case of empty transactions
> where the subscription (and its subscription worker) is not interested.
>

I think when we skip non-streamed transactions we don't start a
transaction. So, if we do what you are suggesting, we will fail to
clear the skip_lsn after skipping the transaction.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

March 16, 2022, 05:34:44

On Wed, Mar 16, 2022 at 7:58 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Mar 16, 2022 at 6:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Mar 15, 2022 at 7:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Mar 15, 2022 at 11:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > >
> > > 6.
> > > @@ -1583,7 +1649,8 @@ apply_handle_insert(StringInfo s)
> > >   TupleTableSlot *remoteslot;
> > >   MemoryContext oldctx;
> > >
> > > - if (handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
> > > + if (is_skipping_changes() ||
> > >
> > > Is there a reason to keep the skip_changes check here and in other DML
> > > operations instead of at one central place in apply_dispatch?
> >
> > Since we already have the check of applying the change on the spot at
> > the beginning of the handlers I feel it's better to add
> > is_skipping_changes() to the check than add a new if statement to
> > apply_dispatch, but do you prefer to check it in one central place in
> > apply_dispatch?
> >
>
> I think either way is fine. I just wanted to know the reason, your
> current change looks okay to me.
>

I feel it is better to at least add a comment suggesting that we skip
only data modification changes because the other part of message
handle_stream_* is there in other message handlers as well. It will
make it easier to add a similar check in future message handlers.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

March 16, 2022, 07:20:25

On Wed, Mar 16, 2022 at 7:58 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Mar 16, 2022 at 6:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Mar 15, 2022 at 7:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Mar 15, 2022 at 11:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > >
> > > 6.
> > > @@ -1583,7 +1649,8 @@ apply_handle_insert(StringInfo s)
> > >   TupleTableSlot *remoteslot;
> > >   MemoryContext oldctx;
> > >
> > > - if (handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
> > > + if (is_skipping_changes() ||
> > >
> > > Is there a reason to keep the skip_changes check here and in other DML
> > > operations instead of at one central place in apply_dispatch?
> >
> > Since we already have the check of applying the change on the spot at
> > the beginning of the handlers I feel it's better to add
> > is_skipping_changes() to the check than add a new if statement to
> > apply_dispatch, but do you prefer to check it in one central place in
> > apply_dispatch?
> >
>
> I think either way is fine. I just wanted to know the reason, your
> current change looks okay to me.
>
> Some questions/comments
> ======================
>

Some cosmetic suggestions:
======================
1.
+# Create subscriptions. Both subscription sets disable_on_error to on
+# so that they get disabled when a conflict occurs.
+$node_subscriber->safe_psql(
+ 'postgres',
+ qq[
+CREATE SUBSCRIPTION $subname CONNECTION '$publisher_connstr'
PUBLICATION tap_pub WITH (streaming = on, two_phase = on,
disable_on_error = on);
+]);

I don't understand what you mean by 'Both subscription ...' in the
above comments.

2.
+ # Check the log indicating that successfully skipped the transaction,

How about slightly rephrasing this to: "Check the log to ensure that
the transaction is skipped...."?

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

March 16, 2022, 09:14:25

On Tue, Mar 15, 2022 at 7:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Mar 15, 2022 at 11:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached an updated version patch.
> >
>
> Review:
> =======

Thank you for the comments.

> 1.
> +++ b/doc/src/sgml/logical-replication.sgml
> @@ -366,15 +366,19 @@ CONTEXT:  processing remote data for replication
> origin "pg_16395" during "INSER
>     transaction, the subscription needs to be disabled temporarily by
>     <command>ALTER SUBSCRIPTION ... DISABLE</command> first or
> alternatively, the
>     subscription can be used with the
> <literal>disable_on_error</literal> option.
> -   Then, the transaction can be skipped by calling the
> +   Then, the transaction can be skipped by using
> +   <command>ALTER SUBSCRITPION ... SKIP</command> with the finish LSN
> +   (i.e., LSN 0/14C0378). After that the replication
> +   can be resumed by <command>ALTER SUBSCRIPTION ... ENABLE</command>.
> +   Alternatively, the transaction can also be skipped by calling the
>
> Do we really need to disable the subscription for the skip feature? I
> think that is required for origin_advance. Also, probably, we can say
> Finish LSN could be Prepare LSN, Commit LSN, etc.

It's not necessary to disable the subscription for the skip feature. Fixed.

>
> 2.
> + /*
> + * Quick return if it's not requested to skip this transaction. This
> + * function is called every start of applying changes and we assume that
> + * skipping the transaction is not used in many cases.
> + */
> + if (likely(XLogRecPtrIsInvalid(MySubscription->skiplsn) ||
>
> The second part of this comment (especially ".. every start of
> applying changes ..") sounds slightly odd to me. How about changing it
> to: "This function is called for every remote transaction and we
> assume that skipping the transaction is not used in many cases."
>

Fixed.

> 3.
> +
> + ereport(LOG,
> + errmsg("start skipping logical replication transaction which
> finished at %X/%X",
> ...
> + ereport(LOG,
> + (errmsg("done skipping logical replication transaction which
> finished at %X/%X",
>
> No need of 'which' in above LOG messages. I think the message will be
> clear without the use of which in above message.

Removed.

>
> 4.
> + ereport(LOG,
> + (errmsg("done skipping logical replication transaction which
> finished at %X/%X",
> + LSN_FORMAT_ARGS(skip_xact_finish_lsn))));
> +
> + /* Stop skipping changes */
> + skip_xact_finish_lsn = InvalidXLogRecPtr;
>
> Let's reverse the order of these statements to make them consistent
> with the corresponding maybe_start_* function.

But we cannot simply reverse the order since skip_xact_finish_lsn is
used in the log message. Do we want to use a variable for it?
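
One way the "use a variable" option could look is sketched below: the LSN is copied into a local before the skip state is reset, so the reset can come first and the log message can still report the value. This is a standalone illustration, not the patch's code; last_log is a hypothetical buffer standing in for ereport(LOG, ...), and the message text assumes the wording without 'which' agreed on above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint64_t XLogRecPtr;                 /* simplified stand-in */
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/* Finish LSN of the transaction being skipped; invalid when not skipping. */
XLogRecPtr skip_xact_finish_lsn = InvalidXLogRecPtr;

/* Hypothetical stand-in for the server log. */
char last_log[128];

void
stop_skipping_changes(void)
{
    /* Remember the LSN, since the state is reset before logging. */
    XLogRecPtr finished_lsn = skip_xact_finish_lsn;

    /* The caller is expected to check that skipping is in progress. */
    assert(finished_lsn != InvalidXLogRecPtr);

    /* Stop skipping changes */
    skip_xact_finish_lsn = InvalidXLogRecPtr;

    /* Print the LSN as its upper and lower 32-bit halves, "hi/lo". */
    snprintf(last_log, sizeof(last_log),
             "done skipping logical replication transaction finished at %X/%X",
             (unsigned int) (finished_lsn >> 32),
             (unsigned int) finished_lsn);
}
```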

>
> 5.
> +
> + if (myskiplsn != finish_lsn)
> + ereport(WARNING,
> + errmsg("skip-LSN of logical replication subscription \"%s\"
> cleared", MySubscription->name),
>
> Shouldn't this be a LOG instead of a WARNING as this will be displayed
> only in server logs and by background apply worker?

WARNINGs are also used by other auxiliary processes such as the archiver,
autovacuum workers, and the launcher. So I think we can use it here.

>
> 6.
> @@ -1583,7 +1649,8 @@ apply_handle_insert(StringInfo s)
>   TupleTableSlot *remoteslot;
>   MemoryContext oldctx;
>
> - if (handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
> + if (is_skipping_changes() ||
>
> Is there a reason to keep the skip_changes check here and in other DML
> operations instead of at one central place in apply_dispatch?

I'd leave it as is as I mentioned in another email. But I've added
some comments as you suggested.

>
> 7.
> + /*
> + * Start a new transaction to clear the subskipxid, if not started
> + * yet. The transaction is committed below.
> + */
> + if (!IsTransactionState())
>
> I think the second part of the comment: "The transaction is committed
> below." is not required.

Removed.

>
> 8.
> + XLogRecPtr subskiplsn; /* All changes which finished at this LSN are
> + * skipped */
> +
>  #ifdef CATALOG_VARLEN /* variable-length fields start here */
>   /* Connection string to the publisher */
>   text subconninfo BKI_FORCE_NOT_NULL;
> @@ -109,6 +112,8 @@ typedef struct Subscription
>   bool disableonerr; /* Indicates if the subscription should be
>   * automatically disabled if a worker error
>   * occurs */
> + XLogRecPtr skiplsn; /* All changes which finished at this LSN are
> + * skipped */
>
> No need for 'which' in the above comments.

Removed.

>
> 9.
> Can we merge 029_disable_on_error in 030_skip_xact and name it as
> 029_on_error (or 029_on_error_skip_disable or some variant of it)?
> Both seem to be related features. I am slightly worried at the pace at
> which the number of test files are growing in subscription test.

Yes, we can merge them.

I'll submit an updated version patch after incorporating all comments I got.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

RE: Skipping logical replication transactions on subscriber side

From

"osumi.takamichi@fujitsu.com"

Date:

March 16, 2022, 09:36:52

On Wednesday, March 16, 2022 11:33 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Tue, Mar 15, 2022 at 7:30 PM osumi.takamichi@fujitsu.com
> <osumi.takamichi@fujitsu.com> wrote:
> >
> > On Tuesday, March 15, 2022 3:13 PM Masahiko Sawada
> <sawada.mshk@gmail.com> wrote:
> > > I've attached an updated version patch.
> >
> > A couple of minor comments on v14.
> >
> > (1) apply_handle_commit_internal
> >
> >
> > +       if (is_skipping_changes())
> > +       {
> > +               stop_skipping_changes();
> > +
> > +               /*
> > +                * Start a new transaction to clear the subskipxid, if not
> started
> > +                * yet. The transaction is committed below.
> > +                */
> > +               if (!IsTransactionState())
> > +                       StartTransactionCommand();
> > +       }
> > +
> >
> > I suppose we can move this condition check and stop_skipping_changes()
> > > call to the inside of the block we enter when IsTransactionState() returns true.
> >
> > As the comment of apply_handle_commit_internal() mentions, it's the
> > helper function for apply_handle_commit() and
> > apply_handle_stream_commit().
> >
> > Then, I couldn't think that both callers don't open a transaction
> > before the call of apply_handle_commit_internal().
> > For applying spooled messages, we call begin_replication_step as well.
> >
> > I can miss something, but timing when we receive COMMIT message
> > without opening a transaction, would be the case of empty transactions
> > where the subscription (and its subscription worker) is not interested.
> >
> 
> I think when we skip non-streamed transactions we don't start a transaction.
> So, if we do what you are suggesting, we will miss to clear the skip_lsn after
> skipping the transaction.
OK, this is what I missed.

On the other hand, what worried me is that an empty transaction can
start skipping changes if subskiplsn equals the finish LSN of that
empty transaction. This happens because we call
maybe_start_skipping_changes even for empty transactions and set
skip_xact_finish_lsn to the finish LSN in that case.

I confirmed that I could make this happen using a debugger and some
LSN logging. I set up two pub/sub pairs and made a change on one of
them after setting a breakpoint in logicalrep_write_begin on the
walsender that would send an empty transaction. I then checked the
finish LSN of that transaction and ran ALTER SUBSCRIPTION ... SKIP
with that LSN value. As a result, the empty transaction calls
stop_skipping_changes in apply_handle_commit_internal and then enters
the IsTransactionState() == true block, which would not happen
without the patch.

This behavior also seems to contradict a comment in worker.c: "The
subskiplsn is cleared after successfully skipping the transaction or
applying non-empty transaction." That is why I was confused and wrote
the comment above.

I think this would not happen in practice, so it might be OK without
any special handling, but I wasn't sure.

Best Regards,
    Takamichi Osumi

RE: Skipping logical replication transactions on subscriber side

From

"osumi.takamichi@fujitsu.com"

Date:

March 16, 2022, 09:57:34

On Wednesday, March 16, 2022 3:37 PM I wrote:
> On Wednesday, March 16, 2022 11:33 AM Amit Kapila
> <amit.kapila16@gmail.com> wrote:
> > On Tue, Mar 15, 2022 at 7:30 PM osumi.takamichi@fujitsu.com
> > <osumi.takamichi@fujitsu.com> wrote:
> > >
> > > On Tuesday, March 15, 2022 3:13 PM Masahiko Sawada
> > <sawada.mshk@gmail.com> wrote:
> > > > I've attached an updated version patch.
> > >
> > > A couple of minor comments on v14.
> > >
> > > (1) apply_handle_commit_internal
> > >
> > >
> > > +       if (is_skipping_changes())
> > > +       {
> > > +               stop_skipping_changes();
> > > +
> > > +               /*
> > > +                * Start a new transaction to clear the subskipxid,
> > > + if not
> > started
> > > +                * yet. The transaction is committed below.
> > > +                */
> > > +               if (!IsTransactionState())
> > > +                       StartTransactionCommand();
> > > +       }
> > > +
> > >
> > > I suppose we can move this condition check and
> > > stop_skipping_changes() call to the inside of the block we enter
> > > > when IsTransactionState() returns true.
> > >
> > > As the comment of apply_handle_commit_internal() mentions, it's the
> > > helper function for apply_handle_commit() and
> > > apply_handle_stream_commit().
> > >
> > > Then, I couldn't think that both callers don't open a transaction
> > > before the call of apply_handle_commit_internal().
> > > For applying spooled messages, we call begin_replication_step as well.
> > >
> > > I can miss something, but timing when we receive COMMIT message
> > > without opening a transaction, would be the case of empty
> > > > transactions where the subscription (and its subscription worker) is not
> > > > interested.
> > >
> >
> > I think when we skip non-streamed transactions we don't start a transaction.
> > So, if we do what you are suggesting, we will miss to clear the
> > skip_lsn after skipping the transaction.
> OK, this is what I missed.
> 
> On the other hand, what I was worried about is that empty transaction can start
> skipping changes, if the subskiplsn is equal to the finish LSN for the empty
> transaction. The reason is we call maybe_start_skipping_changes even for
> empty ones and set skip_xact_finish_lsn by the finish LSN in that case.
> 
> I checked I could make this happen with debugger and some logs for LSN.
> What I did is just having two pairs of pub/sub and conduct a change for one of
> them, after I set a breakpoint in the logicalrep_write_begin on the walsender
> that will issue an empty transaction.
> Then, I check the finish LSN of it and
> conduct an alter subscription skip lsn command with this LSN value.
> As a result, empty transaction calls stop_skipping_changes in the
> apply_handle_commit_internal and then enter the block for IsTransactionState
> == true, which would not happen before applying the patch.
> 
> Also, this behavior looks contradicted with some comments in worker.c "The
> subskiplsn is cleared after successfully skipping the transaction or applying
> non-empty transaction." so, I was just confused and wrote the above comment.
Sorry, my understanding was not correct.

Even when the subskiplsn is cleared by an empty transaction,
that still counts as successfully skipping the transaction.
So this behavior, including allowing an empty transaction to match the
LSN given in ALTER SUBSCRIPTION, is fine.
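
The matching rule described in this exchange can be sketched as a small, self-contained C model. It is only an illustration under stated assumptions: the `Toy` types and `toy_` functions are hypothetical, the catalog update is reduced to a field assignment, and nothing here is taken from worker.c beyond the idea that the requested LSN is compared with each transaction's finish LSN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t ToyLsn;            /* stand-in for XLogRecPtr */
#define TOY_INVALID_LSN ((ToyLsn) 0)

typedef struct ToyWorker
{
    ToyLsn subskiplsn;              /* LSN requested by the user */
    ToyLsn skip_xact_finish_lsn;    /* valid only while skipping */
} ToyWorker;

/* Called at BEGIN for every remote transaction, empty or not. */
void toy_maybe_start_skipping(ToyWorker *w, ToyLsn finish_lsn)
{
    if (w->subskiplsn == TOY_INVALID_LSN || w->subskiplsn != finish_lsn)
        return;                     /* nothing requested, or no match */
    w->skip_xact_finish_lsn = finish_lsn;
}

bool toy_is_skipping(const ToyWorker *w)
{
    return w->skip_xact_finish_lsn != TOY_INVALID_LSN;
}

/* Called at COMMIT: stop skipping and clear the request. */
void toy_finish_skipping(ToyWorker *w)
{
    if (!toy_is_skipping(w))
        return;
    w->skip_xact_finish_lsn = TOY_INVALID_LSN;
    w->subskiplsn = TOY_INVALID_LSN;
}

/* Usage: only the transaction whose finish LSN matches is skipped. */
void toy_demo_skip(void)
{
    ToyWorker w = { 0x1D30788, TOY_INVALID_LSN };

    toy_maybe_start_skipping(&w, 0x1D30000);   /* other transaction: no match */
    assert(!toy_is_skipping(&w));
    toy_maybe_start_skipping(&w, 0x1D30788);   /* match, even if it is empty */
    assert(toy_is_skipping(&w));
    toy_finish_skipping(&w);
    assert(w.subskiplsn == TOY_INVALID_LSN);
}
```

Because the model never asks whether the matched transaction carried any changes, an empty transaction with the requested finish LSN clears the request just as a non-empty one would.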

Sorry for the noise.


Best Regards,
    Takamichi Osumi

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

March 16, 2022, 10:07:07

On Wed, Mar 16, 2022 at 11:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Mar 16, 2022 at 6:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Mar 15, 2022 at 7:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Mar 15, 2022 at 11:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > >
> > > 6.
> > > @@ -1583,7 +1649,8 @@ apply_handle_insert(StringInfo s)
> > >   TupleTableSlot *remoteslot;
> > >   MemoryContext oldctx;
> > >
> > > - if (handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
> > > + if (is_skipping_changes() ||
> > >
> > > Is there a reason to keep the skip_changes check here and in other DML
> > > operations instead of at one central place in apply_dispatch?
> >
> > Since we already have the check of applying the change on the spot at
> > the beginning of the handlers I feel it's better to add
> > is_skipping_changes() to the check than add a new if statement to
> > apply_dispatch, but do you prefer to check it in one central place in
> > apply_dispatch?
> >
>
> I think either way is fine. I just wanted to know the reason, your
> current change looks okay to me.
>
> Some questions/comments
> ======================
> 1. IIRC, earlier, we thought of allowing to use of this option (SKIP)
> only for superusers (as this can lead to inconsistent data if not used
> carefully) but I don't see that check in the latest patch. What is the
> reason for the same?

I thought a non-superuser subscription owner could resolve the
conflict by manually manipulating the relations, which has the same
result as skipping all data modification changes with the ALTER
SUBSCRIPTION SKIP feature. But on further thought, it would not be
exactly the same, since the skipped transaction might include changes
to relations that the owner doesn't have permission on.

>
> 2.
> + /*
> + * Update the subskiplsn of the tuple to InvalidXLogRecPtr.
>
> I think we can change the above part of the comment to "Clear subskiplsn."
>

Fixed.

> 3.
> + * Since we already have
>
> Isn't it better to say here: Since we have already ...?

Fixed.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

March 16, 2022, 11:22:35

On Wed, Mar 16, 2022 at 1:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Mar 16, 2022 at 7:58 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Mar 16, 2022 at 6:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Tue, Mar 15, 2022 at 7:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Tue, Mar 15, 2022 at 11:43 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > >
> > > > 6.
> > > > @@ -1583,7 +1649,8 @@ apply_handle_insert(StringInfo s)
> > > >   TupleTableSlot *remoteslot;
> > > >   MemoryContext oldctx;
> > > >
> > > > - if (handle_streamed_transaction(LOGICAL_REP_MSG_INSERT, s))
> > > > + if (is_skipping_changes() ||
> > > >
> > > > Is there a reason to keep the skip_changes check here and in other DML
> > > > operations instead of at one central place in apply_dispatch?
> > >
> > > Since we already have the check of applying the change on the spot at
> > > the beginning of the handlers I feel it's better to add
> > > is_skipping_changes() to the check than add a new if statement to
> > > apply_dispatch, but do you prefer to check it in one central place in
> > > apply_dispatch?
> > >
> >
> > I think either way is fine. I just wanted to know the reason, your
> > current change looks okay to me.
> >
> > Some questions/comments
> > ======================
> >
>
> Some cosmetic suggestions:
> ======================
> 1.
> +# Create subscriptions. Both subscription sets disable_on_error to on
> +# so that they get disabled when a conflict occurs.
> +$node_subscriber->safe_psql(
> + 'postgres',
> + qq[
> +CREATE SUBSCRIPTION $subname CONNECTION '$publisher_connstr'
> PUBLICATION tap_pub WITH (streaming = on, two_phase = on,
> disable_on_error = on);
> +]);
>
> I don't understand what you mean by 'Both subscription ...' in the
> above comments.

Fixed.

>
> 2.
> + # Check the log indicating that successfully skipped the transaction,
>
> How about slightly rephrasing this to: "Check the log to ensure that
> the transaction is skipped...."?

Fixed.

I've attached an updated version patch.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

v15-0001-Add-ALTER-SUBSCRIPTION-.-SKIP-to-skip-the-transa.patch

RE: Skipping logical replication transactions on subscriber side

From

"shiy.fnst@fujitsu.com"

Date:

March 17, 2022, 05:43:31

On Wed, Mar 16, 2022 4:23 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> 
> I've attached an updated version patch.
> 

Thanks for updating the patch. Here are some comments for the v15 patch.

1. src/backend/replication/logical/worker.c

+ * to skip applying the changes when starting to apply changes.  The subskiplsn is
+ * cleared after successfully skipping the transaction or applying non-empty
+ * transaction. The latter prevents the mistakenly specified subskiplsn from

Should "applying non-empty transaction" be modified to "finishing a
transaction"? To be consistent with the description in the
alter_subscription.sgml.

2. src/test/subscription/t/029_on_error.pl

+# Test of logical replication subscription self-disabling feature.

Should we add something about "skip logical replication transactions" in this
comment?

Regards,
Shi yu

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

March 17, 2022, 06:29:53

On Thu, Mar 17, 2022 at 8:13 AM shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
>
> On Wed, Mar 16, 2022 4:23 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached an updated version patch.
> >
>
> Thanks for updating the patch. Here are some comments for the v15 patch.
>
> 1. src/backend/replication/logical/worker.c
>
> + * to skip applying the changes when starting to apply changes.  The subskiplsn is
> + * cleared after successfully skipping the transaction or applying non-empty
> + * transaction. The latter prevents the mistakenly specified subskiplsn from
>
> Should "applying non-empty transaction" be modified to "finishing a
> transaction"? To be consistent with the description in the
> alter_subscription.sgml.
>

The current wording in the patch seems okay to me, as it is good to
emphasize non-empty transactions.

> 2. src/test/subscription/t/029_on_error.pl
>
> +# Test of logical replication subscription self-disabling feature.
>
> Should we add something about "skip logical replication transactions" in this
> comment?
>

How about: "Tests for disable_on_error and SKIP transaction features."?

I am making some other minor edits in the patch and will take care of
whatever we decide for these comments.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

March 17, 2022, 09:03:42

On Wed, Mar 16, 2022 at 1:53 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached an updated version patch.
>

The patch LGTM. I have made minor changes in comments and docs in the
attached patch. Kindly let me know what you think of the attached?

I am planning to commit this early next week (on Monday) unless there
are more comments/suggestions.

-- 
With Regards,
Amit Kapila.

Attachments

v16-0001-Add-ALTER-SUBSCRIPTION-.-SKIP.patch

RE: Skipping logical replication transactions on subscriber side

From

"osumi.takamichi@fujitsu.com"

Date:

March 17, 2022, 10:09:29

On Thursday, March 17, 2022 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Mar 16, 2022 at 1:53 PM Masahiko Sawada
> <sawada.mshk@gmail.com> wrote:
> >
> > I've attached an updated version patch.
> >
> 
> The patch LGTM. I have made minor changes in comments and docs in the
> attached patch. Kindly let me know what you think of the attached?
Hi, thank you for the patch. A few minor comments.

(1) comment of maybe_start_skipping_changes

+       /*
+        * Quick return if it's not requested to skip this transaction. This
+        * function is called for every remote transaction and we assume that
+        * skipping the transaction is not used often.
+        */

I feel this comment should say more about our intention and
what the check confirms. Strictly speaking, when the user requests a skip
but the condition doesn't match, we don't start
skipping changes.

From:
Quick return if it's not requested to skip this transaction.

To:
Quick return if we can't ensure possible skiplsn is set
and it equals to the finish LSN of this transaction.

(2) 029_on_error.pl

+       my $contents = slurp_file($node_subscriber->logfile, $offset);
+       $contents =~
+         qr/processing remote data for replication origin \"pg_\d+\" during "INSERT" for replication target relation "public.tbl" in transaction \d+ finishe$
+         or die "could not get error-LSN";

I think we should avoid introducing new terms.

How about the change below?

From:
could not get error-LSN
To:
failed to find expected error message that contains finish LSN for SKIP option

(3) apply_handle_commit_internal

Lastly, may I ask why both stop_skipping_changes and
clear_subscription_skip_lsn are called in this function, instead of
at the end of apply_handle_commit and apply_handle_stream_commit?

IMHO, this structure creates extra conditional branches in
apply_handle_commit_internal.

Also, because of this code, when apply_handle_commit_internal calls
stop_skipping_changes after confirming that is_skipping_changes()
returns true, is_skipping_changes() is checked again at the top of
stop_skipping_changes.

OTOH, in other cases such as apply_handle_prepare and
apply_handle_stream_prepare, we call those two functions (or one of
them) as needed, after the existing commit and during the closing
processing. (In the case of rollback_prepare, it is also called after
the existing commit.)

I feel that if we move those two calls to the end of
apply_handle_commit and apply_handle_stream_commit, the code will be
more consistent and more readable.

Best Regards,
    Takamichi Osumi

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

March 17, 2022, 11:52:20

On Thu, Mar 17, 2022 at 12:39 PM osumi.takamichi@fujitsu.com
<osumi.takamichi@fujitsu.com> wrote:
>
> On Thursday, March 17, 2022 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Wed, Mar 16, 2022 at 1:53 PM Masahiko Sawada
> > <sawada.mshk@gmail.com> wrote:
> > >
> > > I've attached an updated version patch.
> > >
> >
> > The patch LGTM. I have made minor changes in comments and docs in the
> > attached patch. Kindly let me know what you think of the attached?
> Hi, thank you for the patch. Few minor comments.
>
>
> (1) comment of maybe_start_skipping_changes
>
>
> +       /*
> +        * Quick return if it's not requested to skip this transaction. This
> +        * function is called for every remote transaction and we assume that
> +        * skipping the transaction is not used often.
> +        */
>
> I feel this comment should explain more about our intention and
> what it confirms. In a case when user requests skip,
> but it doesn't match the condition, we don't start
> skipping changes, strictly speaking.
>
> From:
> Quick return if it's not requested to skip this transaction.
>
> To:
> Quick return if we can't ensure possible skiplsn is set
> and it equals to the finish LSN of this transaction.
>

Hmm, the current comment seems more appropriate. What you are
suggesting is almost writing the code in sentence form.

>
> (2) 029_on_error.pl
>
> +       my $contents = slurp_file($node_subscriber->logfile, $offset);
> +       $contents =~
> +         qr/processing remote data for replication origin \"pg_\d+\" during "INSERT" for replication target relation "public.tbl" in transaction \d+ finishe$
> +         or die "could not get error-LSN";
>
> I think we shouldn't use a lot of new words.
>
> How about a change below  ?
>
> From:
> could not get error-LSN
> To:
> failed to find expected error message that contains finish LSN for SKIP option
>
>
> (3) apply_handle_commit_internal
>
...
>
> I feel if we move those two functions at the end
> of the apply_handle_commit and apply_handle_stream_commit,
> then we will have more aligned codes and improve readability.
>

I think the intention is to avoid duplicate code as we have a common
function that gets called from both of those. OTOH, if Sawada-San or
others also prefer your approach to rearrange the code then I am fine
with it.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

March 17, 2022, 13:55:40

On Thu, Mar 17, 2022 at 5:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Mar 17, 2022 at 12:39 PM osumi.takamichi@fujitsu.com
> <osumi.takamichi@fujitsu.com> wrote:
> >
> > On Thursday, March 17, 2022 3:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > On Wed, Mar 16, 2022 at 1:53 PM Masahiko Sawada
> > > <sawada.mshk@gmail.com> wrote:
> > > >
> > > > I've attached an updated version patch.
> > > >
> > >
> > > The patch LGTM. I have made minor changes in comments and docs in the
> > > attached patch. Kindly let me know what you think of the attached?
> > Hi, thank you for the patch. Few minor comments.
> >
> >
> > (1) comment of maybe_start_skipping_changes
> >
> >
> > +       /*
> > +        * Quick return if it's not requested to skip this transaction. This
> > +        * function is called for every remote transaction and we assume that
> > +        * skipping the transaction is not used often.
> > +        */
> >
> > I feel this comment should explain more about our intention and
> > what it confirms. In a case when user requests skip,
> > but it doesn't match the condition, we don't start
> > skipping changes, strictly speaking.
> >
> > From:
> > Quick return if it's not requested to skip this transaction.
> >
> > To:
> > Quick return if we can't ensure possible skiplsn is set
> > and it equals to the finish LSN of this transaction.
> >
>
> Hmm, the current comment seems more appropriate. What you are
> suggesting is almost writing the code in sentence form.
>
> >
> > (2) 029_on_error.pl
> >
> > +       my $contents = slurp_file($node_subscriber->logfile, $offset);
> > +       $contents =~
> > +         qr/processing remote data for replication origin \"pg_\d+\" during "INSERT" for replication target relation "public.tbl" in transaction \d+ finishe$
> > +         or die "could not get error-LSN";
> >
> > I think we shouldn't use a lot of new words.
> >
> > How about a change below  ?
> >
> > From:
> > could not get error-LSN
> > To:
> > failed to find expected error message that contains finish LSN for SKIP option
> >
> >
> > (3) apply_handle_commit_internal
> >
> ...
> >
> > I feel if we move those two functions at the end
> > of the apply_handle_commit and apply_handle_stream_commit,
> > then we will have more aligned codes and improve readability.
> >

I think we cannot just move them to the end of apply_handle_commit()
and apply_handle_stream_commit(). If we did, we would fail to update
replication_session_origin_lsn/timestamp when clearing the subskiplsn
while skipping a non-streamed transaction.

Basically, the apply worker handles 2PC and non-2PC transactions
differently: we always prepare even empty transactions, whereas we
don't commit empty non-2PC transactions. So I think we don't have to
handle both in the same way.

> I think the intention is to avoid duplicate code as we have a common
> function that gets called from both of those.

Yes.
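
The point about keeping the calls in the shared helper can be illustrated with a toy, self-contained C model. It loosely follows the apply_handle_commit_internal() hunk quoted earlier in the thread, but every `Toy`/`toy_` name is hypothetical, "starting" and "committing" a transaction are reduced to flipping a flag, and `origin_lsn` merely stands in for the session origin LSN mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t ToyLsn;            /* stand-in for XLogRecPtr */
#define TOY_INVALID_LSN ((ToyLsn) 0)

typedef struct ToyApplyState
{
    bool   in_transaction;   /* models IsTransactionState() */
    bool   skipping;         /* models is_skipping_changes() */
    ToyLsn subskiplsn;       /* pending skip request */
    ToyLsn origin_lsn;       /* models the session origin LSN */
} ToyApplyState;

/* Shared helper called by both the plain and the streamed commit paths. */
void toy_commit_internal(ToyApplyState *st, ToyLsn end_lsn)
{
    if (st->skipping)
    {
        st->skipping = false;
        /* A skipped non-streamed transaction never opened a transaction. */
        if (!st->in_transaction)
            st->in_transaction = true;
    }

    if (st->in_transaction)
    {
        st->subskiplsn = TOY_INVALID_LSN;  /* clear the skip request */
        st->origin_lsn = end_lsn;          /* advance the origin with it */
        st->in_transaction = false;        /* "commit" */
    }
    /* else: empty transaction that was not skipped, nothing to commit */
}

/* Usage: a skipped transaction still clears the request and moves the origin. */
void toy_demo_commit(void)
{
    ToyApplyState skipped = { false, true, 100, 50 };

    toy_commit_internal(&skipped, 120);
    assert(skipped.subskiplsn == TOY_INVALID_LSN);
    assert(skipped.origin_lsn == 120);
}
```

In this model, clearing the request and advancing the origin happen in the same branch, so a caller that cleared the request only after the helper returned would have to repeat the origin update itself.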

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

RE: Skipping logical replication transactions on subscriber side

From

"osumi.takamichi@fujitsu.com"

Date:

March 17, 2022, 15:16:15

On Thursday, March 17, 2022 7:56 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Thu, Mar 17, 2022 at 5:52 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Mar 17, 2022 at 12:39 PM osumi.takamichi@fujitsu.com
> > <osumi.takamichi@fujitsu.com> wrote:
> > >
> > > On Thursday, March 17, 2022 3:04 PM Amit Kapila
> <amit.kapila16@gmail.com> wrote:
> > > > On Wed, Mar 16, 2022 at 1:53 PM Masahiko Sawada
> > > > <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > I've attached an updated version patch.
> > > > >
> > > >
> > > > The patch LGTM. I have made minor changes in comments and docs in
> > > > the attached patch. Kindly let me know what you think of the attached?
> > > Hi, thank you for the patch. Few minor comments.
> > >
> > >
> > > (3) apply_handle_commit_internal
> > >
> > ...
> > >
> > > I feel if we move those two functions at the end of the
> > > apply_handle_commit and apply_handle_stream_commit, then we will
> > > have more aligned codes and improve readability.
> > >
> 
> I think we cannot just move them to the end of apply_handle_commit() and
> apply_handle_stream_commit(). Because if we do that, we end up missing
> updating replication_session_origin_lsn/timestamp when clearing the
> subskiplsn if we're skipping a non-stream transaction.
> 
> Basically, the apply worker differently handles 2pc transactions and non-2pc
> transactions; we always prepare even empty transactions whereas we don't
> commit empty non-2pc transactions. So I think we don’t have to handle both in
> the same way.
Okay. Thank you so much for your explanation.
Then the code looks good to me.


Best Regards,
    Takamichi Osumi

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

March 18, 2022, 03:16:20

On Thu, Mar 17, 2022 at 3:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Mar 16, 2022 at 1:53 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached an updated version patch.
> >
>
> The patch LGTM. I have made minor changes in comments and docs in the
> attached patch. Kindly let me know what you think of the attached?

Thank you for updating the patch. It looks good to me.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

"Euler Taveira"

Date:

March 21, 2022, 04:39:25

On Thu, Mar 17, 2022, at 3:03 AM, Amit Kapila wrote:

On Wed, Mar 16, 2022 at 1:53 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached an updated version patch.
>

The patch LGTM. I have made minor changes in comments and docs in the
attached patch. Kindly let me know what you think of the attached?

I am planning to commit this early next week (on Monday) unless there
are more comments/suggestions.

I reviewed this last version and I have a few comments.

+ * If the user set subskiplsn, we do a sanity check to make
+ * sure that the specified LSN is a probable value.

... user *sets*...

+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("skip WAL location (LSN) must be greater than origin LSN %X/%X",
+ LSN_FORMAT_ARGS(remote_lsn))));

Shouldn't we add the LSN to be skipped in the "(LSN)"?
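
One way the suggestion might look is sketched below as standalone C, outside PostgreSQL. The `toy_` helpers are hypothetical, the message wording with both LSNs is an assumption about how the request could be addressed, and the comparison simply takes "must be greater than" literally, since the actual condition is not shown in this thread.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint64_t ToyLsn;   /* stand-in for XLogRecPtr */

/* Split a 64-bit LSN into the two 32-bit halves printed as %X/%X. */
static uint32_t toy_lsn_hi(ToyLsn lsn) { return (uint32_t) (lsn >> 32); }
static uint32_t toy_lsn_lo(ToyLsn lsn) { return (uint32_t) lsn; }

/*
 * Returns 0 if skip_lsn is acceptable; otherwise writes a message naming
 * both LSNs into buf and returns -1.
 */
int toy_check_skip_lsn(ToyLsn skip_lsn, ToyLsn origin_lsn,
                       char *buf, size_t buflen)
{
    if (skip_lsn > origin_lsn)
        return 0;
    snprintf(buf, buflen,
             "skip WAL location (LSN %" PRIX32 "/%" PRIX32
             ") must be greater than origin LSN %" PRIX32 "/%" PRIX32,
             toy_lsn_hi(skip_lsn), toy_lsn_lo(skip_lsn),
             toy_lsn_hi(origin_lsn), toy_lsn_lo(origin_lsn));
    return -1;
}

/* Usage: a skip LSN behind the origin is rejected with both values shown. */
void toy_demo_check(void)
{
    char buf[128];

    assert(toy_check_skip_lsn(0x200, 0x100, buf, sizeof buf) == 0);
    assert(toy_check_skip_lsn(0x100, 0x200, buf, sizeof buf) == -1);
    assert(strcmp(buf, "skip WAL location (LSN 0/100) must be greater "
                       "than origin LSN 0/200") == 0);
}
```

Printing the rejected value next to the origin LSN lets the user see both sides of the comparison in one message.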

+ * Start a new transaction to clear the subskipxid, if not started
+ * yet.

It seems it means subskiplsn.

+ * subskipxid in order to inform users for cases e.g., where the user mistakenly
+ * specified the wrong subskiplsn.

It seems it means subskiplsn.

+sub test_skip_xact

It seems this function should be named test_skip_lsn. Unless the intention is
to cover other skip options in the future.

src/test/subscription/t/029_disable_on_error.pl | 94 ----------
src/test/subscription/t/029_on_error.pl | 183 +++++++++++++++++++

It seems you are removing a test for 705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33.
I should also name 029_on_error.pl to something else such as 030_skip_lsn.pl or
a generic name 030_skip_option.pl.

Euler Taveira
EDB https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

March 21, 2022, 05:19:43

On Mon, Mar 21, 2022 at 7:09 AM Euler Taveira <euler@eulerto.com> wrote:
>
> src/test/subscription/t/029_disable_on_error.pl |  94 ----------
> src/test/subscription/t/029_on_error.pl         | 183 +++++++++++++++++++
>
> It seems you are removing a test for 705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33.
>

We have covered the same test in the new test file. See "CREATE
SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH
(disable_on_error = true, ...". This will test the cases we were
earlier testing via 'disable_on_error'.

> I should also name 029_on_error.pl to something else such as 030_skip_lsn.pl or
> a generic name 030_skip_option.pl.
>

The reason to keep the name 'on_error' is that it has tests for both
'disable_on_error' option and 'skip_lsn'. The other option could be
'on_error_action' or something like that. Now, does this make sense to
you?

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

March 21, 2022, 06:25:51

On Mon, Mar 21, 2022 at 7:09 AM Euler Taveira <euler@eulerto.com> wrote:
>
> On Thu, Mar 17, 2022, at 3:03 AM, Amit Kapila wrote:
>
> On Wed, Mar 16, 2022 at 1:53 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached an updated version patch.
> >
>
> The patch LGTM. I have made minor changes in comments and docs in the
> attached patch. Kindly let me know what you think of the attached?
>
> I am planning to commit this early next week (on Monday) unless there
> are more comments/suggestions.
>
> I reviewed this last version and I have a few comments.
>
> +                * If the user set subskiplsn, we do a sanity check to make
> +                * sure that the specified LSN is a probable value.
>
> ... user *sets*...
>
> +                       ereport(ERROR,
> +                               (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> +                                errmsg("skip WAL location (LSN) must be greater than origin LSN %X/%X",
> +                                       LSN_FORMAT_ARGS(remote_lsn))));
>
> Shouldn't we add the LSN to be skipped in the "(LSN)"?
>
> +        * Start a new transaction to clear the subskipxid, if not started
> +        * yet.
>
> It seems it means subskiplsn.
>
> + * subskipxid in order to inform users for cases e.g., where the user mistakenly
> + * specified the wrong subskiplsn.
>
> It seems it means subskiplsn.
>
> +sub test_skip_xact
> +{
>
> It seems this function should be named test_skip_lsn. Unless the intention is
> to cover other skip options in the future.
>

I have fixed all the above comments as per your suggestion in the
attached. Do let me know if something is missed?

> src/test/subscription/t/029_disable_on_error.pl |  94 ----------
> src/test/subscription/t/029_on_error.pl         | 183 +++++++++++++++++++
>
> It seems you are removing a test for 705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33.
> I should also name 029_on_error.pl to something else such as 030_skip_lsn.pl or
> a generic name 030_skip_option.pl.
>

As explained in my previous email, I don't think any change is
required for this comment but do let me know if you still think so?

-- 
With Regards,
Amit Kapila.

Attachments

v17-0001-Add-ALTER-SUBSCRIPTION-.-SKIP.patch

Re: Skipping logical replication transactions on subscriber side

From

"Euler Taveira"

Date:

March 21, 2022, 15:21:18

On Mon, Mar 21, 2022, at 12:25 AM, Amit Kapila wrote:

I have fixed all the above comments as per your suggestion in the
attached. Do let me know if something is missed?

Looks good to me.

> src/test/subscription/t/029_disable_on_error.pl | 94 ----------
> src/test/subscription/t/029_on_error.pl | 183 +++++++++++++++++++
>
> It seems you are removing a test for 705e20f8550c0e8e47c0b6b20b5f5ffd6ffd9e33.
> I should also name 029_on_error.pl to something else such as 030_skip_lsn.pl or
> a generic name 030_skip_option.pl.
>

As explained in my previous email, I don't think any change is
required for this comment but do let me know if you still think so?

Oh, sorry about the noise. I saw mixed tests between the 2 new features and I
was confused if it was intentional or not.

Euler Taveira
EDB https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

March 29, 2022, 08:13:00

On Mon, Mar 21, 2022 at 5:51 PM Euler Taveira <euler@eulerto.com> wrote:
>
> On Mon, Mar 21, 2022, at 12:25 AM, Amit Kapila wrote:
>
> I have fixed all the above comments as per your suggestion in the
> attached. Do let me know if something is missed?
>
> Looks good to me.
>

This patch is committed
(https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=208c5d65bbd60e33e272964578cb74182ac726a8).
Today, I have marked the corresponding entry in CF as committed.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Noah Misch

Date:

April 1, 2022, 10:44:23

On Tue, Mar 29, 2022 at 10:43:00AM +0530, Amit Kapila wrote:
> On Mon, Mar 21, 2022 at 5:51 PM Euler Taveira <euler@eulerto.com> wrote:
> > On Mon, Mar 21, 2022, at 12:25 AM, Amit Kapila wrote:
> > I have fixed all the above comments as per your suggestion in the
> > attached. Do let me know if something is missed?
> >
> > Looks good to me.
> 
> This patch is committed
> (https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=208c5d65bbd60e33e272964578cb74182ac726a8).

src/test/subscription/t/029_on_error.pl has been failing reliably on the five
AIX buildfarm members:

# poll_query_until timed out executing this query:
# SELECT subskiplsn = '0/0' FROM pg_subscription WHERE subname = 'sub'
# expecting this output:
# t
# last actual query output:
# f
# with stderr:
timed out waiting for match: (?^:LOG:  done skipping logical replication transaction finished at 0/1D30788) at t/029_on_error.pl line 50.

I've posted five sets of logs (2.7 MiB compressed) here:
https://drive.google.com/file/d/16NkyNIV07o0o8WM7GwcaAYFQDPTkULkR/view?usp=sharing

The members have not actually uploaded these failures, due to an OOM in the
Perl process driving the buildfarm script.  I think the OOM is due to a need
for excess RAM to capture 029_on_error_subscriber.log, which is 27MB here.  I
will move the members to 64-bit Perl.  (AIX 32-bit binaries OOM easily:
https://www.postgresql.org/docs/devel/installation-platform-notes.html#INSTALLATION-NOTES-AIX.)

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

01 April 2022, 11:10:02

On Fri, Apr 1, 2022 at 4:44 PM Noah Misch <noah@leadboat.com> wrote:
>
> On Tue, Mar 29, 2022 at 10:43:00AM +0530, Amit Kapila wrote:
> > On Mon, Mar 21, 2022 at 5:51 PM Euler Taveira <euler@eulerto.com> wrote:
> > > On Mon, Mar 21, 2022, at 12:25 AM, Amit Kapila wrote:
> > > I have fixed all the above comments as per your suggestion in the
> > > attached. Do let me know if something is missed?
> > >
> > > Looks good to me.
> >
> > This patch is committed
> > (https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=208c5d65bbd60e33e272964578cb74182ac726a8).
>
> src/test/subscription/t/029_on_error.pl has been failing reliably on the five
> AIX buildfarm members:
>
> # poll_query_until timed out executing this query:
> # SELECT subskiplsn = '0/0' FROM pg_subscription WHERE subname = 'sub'
> # expecting this output:
> # t
> # last actual query output:
> # f
> # with stderr:
> timed out waiting for match: (?^:LOG:  done skipping logical replication transaction finished at 0/1D30788) at t/029_on_error.pl line 50.
>
> I've posted five sets of logs (2.7 MiB compressed) here:
> https://drive.google.com/file/d/16NkyNIV07o0o8WM7GwcaAYFQDPTkULkR/view?usp=sharing

Thank you for the report. I'm investigating this issue.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

01 April 2022, 15:25:52

On Fri, Apr 1, 2022 at 5:10 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Apr 1, 2022 at 4:44 PM Noah Misch <noah@leadboat.com> wrote:
> >
> > On Tue, Mar 29, 2022 at 10:43:00AM +0530, Amit Kapila wrote:
> > > On Mon, Mar 21, 2022 at 5:51 PM Euler Taveira <euler@eulerto.com> wrote:
> > > > On Mon, Mar 21, 2022, at 12:25 AM, Amit Kapila wrote:
> > > > I have fixed all the above comments as per your suggestion in the
> > > > attached. Do let me know if something is missed?
> > > >
> > > > Looks good to me.
> > >
> > > This patch is committed
> > > (https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=208c5d65bbd60e33e272964578cb74182ac726a8).
> >
> > src/test/subscription/t/029_on_error.pl has been failing reliably on the five
> > AIX buildfarm members:
> >
> > # poll_query_until timed out executing this query:
> > # SELECT subskiplsn = '0/0' FROM pg_subscription WHERE subname = 'sub'
> > # expecting this output:
> > # t
> > # last actual query output:
> > # f
> > # with stderr:
> > timed out waiting for match: (?^:LOG:  done skipping logical replication transaction finished at 0/1D30788) at t/029_on_error.pl line 50.
> >
> > I've posted five sets of logs (2.7 MiB compressed) here:
> > https://drive.google.com/file/d/16NkyNIV07o0o8WM7GwcaAYFQDPTkULkR/view?usp=sharing
>
> Thank you for the report. I'm investigating this issue.

The subscriber logs show that the test successfully fetched the correct
error LSN from the server logs and passed it to ALTER SUBSCRIPTION …
SKIP:

2022-03-30 09:48:36.617 UTC [17039636:4] CONTEXT:  processing remote
data for replication origin "pg_16391" during "INSERT" for replication
target relation "public.tbl" in transaction 725 finished at 0/1D30788
2022-03-30 09:48:36.617 UTC [17039636:5] LOG:  logical replication
subscription "sub" has been disabled due to an error
:
2022-03-30 09:48:36.670 UTC [17039640:1] [unknown] LOG:  connection
received: host=[local]
2022-03-30 09:48:36.672 UTC [17039640:2] [unknown] LOG:  connection
authorized: user=nm database=postgres application_name=029_on_error.pl
2022-03-30 09:48:36.675 UTC [17039640:3] 029_on_error.pl LOG:
statement: ALTER SUBSCRIPTION sub SKIP (lsn = '0/1D30788')
2022-03-30 09:48:36.676 UTC [17039640:4] 029_on_error.pl LOG:
disconnection: session time: 0:00:00.006 user=nm database=postgres
host=[local]
:
2022-03-30 09:48:36.762 UTC [28246036:2] ERROR:  duplicate key value
violates unique constraint "tbl_pkey"
2022-03-30 09:48:36.762 UTC [28246036:3] DETAIL:  Key (i)=(1) already exists.
2022-03-30 09:48:36.762 UTC [28246036:4] CONTEXT:  processing remote
data for replication origin "pg_16391" during "INSERT" for replication
target relation "public.tbl" in transaction 725 finished at 0/1D30788

However, the worker could not start skipping changes of the error
transaction for some reason. Given that "SELECT subskiplsn = '0/0'
FROM pg_subscription WHERE subname = 'sub'" didn't return true, some
value was set to subskiplsn even after the unique key error.

So I'm guessing that the apply worker could not get the updated value
of the subskiplsn or its MySubscription->skiplsn could not match with
the transaction's finish LSN. Also, given that the test is failing on
all AIX buildfarm members, there might be something specific to AIX.

Noah, to investigate this issue further, is it possible for you to
apply the attached patch and run the 029_on_error.pl test? The patch
adds some logs to get additional information.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

add_logs.patch

Re: Skipping logical replication transactions on subscriber side

From

Noah Misch

Date:

02 April 2022, 03:11:53

On Fri, Apr 01, 2022 at 09:25:52PM +0900, Masahiko Sawada wrote:
> > On Fri, Apr 1, 2022 at 4:44 PM Noah Misch <noah@leadboat.com> wrote:
> > > src/test/subscription/t/029_on_error.pl has been failing reliably on the five
> > > AIX buildfarm members:
> > >
> > > # poll_query_until timed out executing this query:
> > > # SELECT subskiplsn = '0/0' FROM pg_subscription WHERE subname = 'sub'
> > > # expecting this output:
> > > # t
> > > # last actual query output:
> > > # f
> > > # with stderr:
> > > timed out waiting for match: (?^:LOG:  done skipping logical replication transaction finished at 0/1D30788) at t/029_on_error.pl line 50.
> > >
> > > I've posted five sets of logs (2.7 MiB compressed) here:
> > > https://drive.google.com/file/d/16NkyNIV07o0o8WM7GwcaAYFQDPTkULkR/view?usp=sharing

> Given that "SELECT subskiplsn = '0/0'
> FROM pg_subscription WHERE subname = 'sub'" didn't return true, some
> value was set to subskiplsn even after the unique key error.
> 
> So I'm guessing that the apply worker could not get the updated value
> of the subskiplsn or its MySubscription->skiplsn could not match with
> the transaction's finish LSN. Also, given that the test is failing on
> all AIX buildfarm members, there might be something specific to AIX.
> 
> Noah, to investigate this issue further, is it possible for you to
> apply the attached patch and run the 029_on_error.pl test? The patch
> adds some logs to get additional information.

Logs attached.  I ran this outside the buildfarm script environment.  Most
notably, I didn't override PG_TEST_TIMEOUT_DEFAULT like my buildfarm
configuration does, so the total log size is smaller.

Attachments

log-subscription-20220401.tar.xz

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

02 April 2022, 04:19:20

On Sat, Apr 2, 2022 at 5:41 AM Noah Misch <noah@leadboat.com> wrote:
>
> On Fri, Apr 01, 2022 at 09:25:52PM +0900, Masahiko Sawada wrote:
> > > On Fri, Apr 1, 2022 at 4:44 PM Noah Misch <noah@leadboat.com> wrote:
> > > > src/test/subscription/t/029_on_error.pl has been failing reliably on the five
> > > > AIX buildfarm members:
> > > >
> > > > # poll_query_until timed out executing this query:
> > > > # SELECT subskiplsn = '0/0' FROM pg_subscription WHERE subname = 'sub'
> > > > # expecting this output:
> > > > # t
> > > > # last actual query output:
> > > > # f
> > > > # with stderr:
> > > > timed out waiting for match: (?^:LOG:  done skipping logical replication transaction finished at 0/1D30788) at t/029_on_error.pl line 50.
> > > >
> > > > I've posted five sets of logs (2.7 MiB compressed) here:
> > > > https://drive.google.com/file/d/16NkyNIV07o0o8WM7GwcaAYFQDPTkULkR/view?usp=sharing
>
> > Given that "SELECT subskiplsn = '0/0'
> > FROM pg_subscription WHERE subname = 'sub'" didn't return true, some
> > value was set to subskiplsn even after the unique key error.
> >
> > So I'm guessing that the apply worker could not get the updated value
> > of the subskiplsn or its MySubscription->skiplsn could not match with
> > the transaction's finish LSN. Also, given that the test is failing on
> > all AIX buildfarm members, there might be something specific to AIX.
> >
> > Noah, to investigate this issue further, is it possible for you to
> > apply the attached patch and run the 029_on_error.pl test? The patch
> > adds some logs to get additional information.
>
> Logs attached.
>

Thank you.

Looking at the logs below:
----
....
2022-04-01 18:19:34.710 CUT [58327402] LOG:  not started skipping
changes: my_skiplsn 14EB7D8/B0706F72 finish_lsn 0/14EB7D8
...
----

It seems that the value of skiplsn read in GetSubscription is wrong
which makes the apply worker think it doesn't need to skip the
transaction. Now, in Alter/Create Subscription, we are using
LSNGetDatum() to store skiplsn value in pg_subscription but while
reading it in GetSubscription(), we are not converting back the datum
to LSN by using DatumGetLSN(). Is it possible that on this machine it
might be leading to not getting the right value for skiplsn? I think
it is worth trying to see if this fixes the problem.

Any other thoughts?

--
With Regards,
Amit Kapila.

Attachments

datum_to_lsn_skiplsn_1.patch

Re: Skipping logical replication transactions on subscriber side

From

Noah Misch

Date:

02 April 2022, 04:59:43

On Sat, Apr 02, 2022 at 06:49:20AM +0530, Amit Kapila wrote:
> On Sat, Apr 2, 2022 at 5:41 AM Noah Misch <noah@leadboat.com> wrote:
> >
> > On Fri, Apr 01, 2022 at 09:25:52PM +0900, Masahiko Sawada wrote:
> > > > On Fri, Apr 1, 2022 at 4:44 PM Noah Misch <noah@leadboat.com> wrote:
> > > > > src/test/subscription/t/029_on_error.pl has been failing reliably on the five
> > > > > AIX buildfarm members:
> > > > >
> > > > > # poll_query_until timed out executing this query:
> > > > > # SELECT subskiplsn = '0/0' FROM pg_subscription WHERE subname = 'sub'
> > > > > # expecting this output:
> > > > > # t
> > > > > # last actual query output:
> > > > > # f
> > > > > # with stderr:
> > > > > timed out waiting for match: (?^:LOG:  done skipping logical replication transaction finished at 0/1D30788) at t/029_on_error.pl line 50.
> > > > >
> > > > > I've posted five sets of logs (2.7 MiB compressed) here:
> > > > > https://drive.google.com/file/d/16NkyNIV07o0o8WM7GwcaAYFQDPTkULkR/view?usp=sharing
> >
> > > Given that "SELECT subskiplsn = '0/0'
> > > FROM pg_subscription WHERE subname = 'sub'" didn't return true, some
> > > value was set to subskiplsn even after the unique key error.
> > >
> > > So I'm guessing that the apply worker could not get the updated value
> > > of the subskiplsn or its MySubscription->skiplsn could not match with
> > > the transaction's finish LSN. Also, given that the test is failing on
> > > all AIX buildfarm members, there might be something specific to AIX.
> > >
> > > Noah, to investigate this issue further, is it possible for you to
> > > apply the attached patch and run the 029_on_error.pl test? The patch
> > > adds some logs to get additional information.
> >
> > Logs attached.
> 
> Thank you.
> 
> Looking at the logs below:
> ----
> ....
> 2022-04-01 18:19:34.710 CUT [58327402] LOG:  not started skipping
> changes: my_skiplsn 14EB7D8/B0706F72 finish_lsn 0/14EB7D8
> ...
> ----
> 
> It seems that the value of skiplsn read in GetSubscription is wrong
> which makes the apply worker think it doesn't need to skip the
> transaction. Now, in Alter/Create Subscription, we are using
> LSNGetDatum() to store skiplsn value in pg_subscription but while
> reading it in GetSubscription(), we are not converting back the datum
> to LSN by using DatumGetLSN(). Is it possible that on this machine it
> might be leading to not getting the right value for skiplsn? I think
> it is worth trying to see if this fixes the problem.

After applying datum_to_lsn_skiplsn_1.patch, I get another failure.  Logs
attached.

Attachments

log-subscription-20220401b.tar.xz

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

02 April 2022, 07:08:32

On Sat, Apr 2, 2022 at 7:29 AM Noah Misch <noah@leadboat.com> wrote:
>
> On Sat, Apr 02, 2022 at 06:49:20AM +0530, Amit Kapila wrote:
>
> After applying datum_to_lsn_skiplsn_1.patch, I get another failure.  Logs
> attached.
>

The failure is for the same reason. I noticed that even when skip lsn
value should be 0/0, it is some invalid value, see: "LOG:  not started
skipping changes: my_skiplsn 0/B0706F72 finish_lsn 0/14EB7D8". Here,
my_skiplsn should be 0/0 instead of 0/B0706F72. Now, I am not sure why
the LSN's 4 bytes are correct and the other 4 bytes have some random
value. A similar problem is there when we have set the valid value of
skip lsn, see: "LOG:  not started skipping changes: my_skiplsn
14EB7D8/B0706F72 finish_lsn 0/14EB7D8". Here the value of my_skiplsn
should be 0/14EB7D8 instead of 14EB7D8/B0706F72.

I am sure that if you create a subscription with the below test and
check the skip lsn value, it will be correct, otherwise, you would
have seen failure in subscription.sql as well. If possible, can you
please check the following example to rule out the possibility:

For example,
Publisher:
Create table t1(c1 int);
Create Publication pub1 for table t1;

Subscriber:
Create table t1(c1 int);
Create Subscription sub1 connection 'dbname = postgres' Publication pub1;
Select subname, subskiplsn from pg_subscription; -- subskiplsn should be 0/0

Alter Subscription sub1 SKIP (LSN = '0/14EB7D8');
Select subname, subskiplsn from pg_subscription; -- subskiplsn should be 0/14EB7D8

Assuming the above is correct and we are still getting the wrong value
in apply worker, the only remaining suspect is the following code in
GetSubscription:
sub->skiplsn = DatumGetLSN(subform->subskiplsn);

I don't know what is wrong with this, because subskiplsn is stored as
pg_lsn, which is a fixed-length type, so we should be able to access it
directly through the struct. Do you see any problem with this?

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

02 April 2022, 10:33:44

On Sat, Apr 2, 2022 at 1:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Apr 2, 2022 at 7:29 AM Noah Misch <noah@leadboat.com> wrote:
> >
> > On Sat, Apr 02, 2022 at 06:49:20AM +0530, Amit Kapila wrote:
> >
> > After applying datum_to_lsn_skiplsn_1.patch, I get another failure.  Logs
> > attached.
> >
>
> The failure is for the same reason. I noticed that even when skip lsn
> value should be 0/0, it is some invalid value, see: "LOG:  not started
> skipping changes: my_skiplsn 0/B0706F72 finish_lsn 0/14EB7D8". Here,
> my_skiplsn should be 0/0 instead of 0/B0706F72. Now, I am not sure why
> the LSN's 4 bytes are correct and the other 4 bytes have some random
> value.

It seems that 0/B0706F72 is not a random value. Two subscriber logs
show the same value. Since 0x70 = 'p', 0x6F = 'o', and 0x72 = 'r', it
might show the next field in the pg_subscription catalog, i.e.,
subconninfo. The subscription is created by "CREATE SUBSCRIPTION sub
CONNECTION 'port=57851 host=/tmp/6u2vRwQYik dbname=postgres'
PUBLICATION pub WITH (disable_on_error = true, streaming = on,
two_phase = on)".

Given that subscription.sql passes, something goes wrong when we read the
subskiplsn value with code like "sub->skiplsn = subform->subskiplsn;".
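
As an illustration of the byte check above, the low 32 bits of the bogus value can be decoded as ASCII. This is only a sketch: printable_bytes is a hypothetical helper, not part of PostgreSQL, and the value 0xB0706F72 is the one taken from the log lines quoted in this thread.

```c
#include <stdint.h>

/*
 * Hypothetical helper: copy the printable ASCII bytes of a 32-bit word,
 * most significant byte first, into out (which must hold 5 chars).
 */
static void printable_bytes(uint32_t word, char out[5])
{
    int n = 0;

    for (int shift = 24; shift >= 0; shift -= 8)
    {
        unsigned char b = (unsigned char) (word >> shift);

        /* Keep only bytes in the printable ASCII range. */
        if (b >= 0x20 && b < 0x7F)
            out[n++] = (char) b;
    }
    out[n] = '\0';
}
```

Calling printable_bytes(0xB0706F72u, buf) leaves "por" in buf, which matches the start of 'port=57851'; the leading 0xB0 byte is not printable and is skipped.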

Is it possible to run the test again with the attached patch?

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

add_logs_v2.patch

Re: Skipping logical replication transactions on subscriber side

From

Noah Misch

Date:

02 April 2022, 11:13:46

On Sat, Apr 02, 2022 at 04:33:44PM +0900, Masahiko Sawada wrote:
> It seems that 0/B0706F72 is not a random value. Two subscriber logs
> show the same value. Since 0x70 = 'p', 0x6F = 'o', and 0x72 = 'r', it
> might show the next field in the pg_subscription catalog, i.e.,
> subconninfo. The subscription is created by "CREATE SUBSCRIPTION sub
> CONNECTION 'port=57851 host=/tmp/6u2vRwQYik dbname=postgres'
> PUBLICATION pub WITH (disable_on_error = true, streaming = on,
> two_phase = on)".
> 
> Given that subscription.sql passes, something goes wrong when we read the
> subskiplsn value with code like "sub->skiplsn = subform->subskiplsn;".

That's a good clue.  We've never made pg_type.typalign able to represent
alignment as it works on AIX.  A uint64 like pg_lsn has 8-byte alignment, so
the C struct follows from that.  At the typalign level, we have only these:

#define  TYPALIGN_CHAR            'c' /* char alignment (i.e. unaligned) */
#define  TYPALIGN_SHORT            's' /* short alignment (typically 2 bytes) */
#define  TYPALIGN_INT            'i' /* int alignment (typically 4 bytes) */
#define  TYPALIGN_DOUBLE        'd' /* double alignment (often 8 bytes) */

On AIX, they are:

#define ALIGNOF_DOUBLE 4 
#define ALIGNOF_INT 4
#define ALIGNOF_LONG 8   
/* #undef ALIGNOF_LONG_LONG_INT */
/* #undef ALIGNOF_PG_INT128_TYPE */
#define ALIGNOF_SHORT 2  

uint64 and pg_lsn use TYPALIGN_DOUBLE.  For AIX, they really need a typalign
corresponding to ALIGNOF_LONG.  Hence, the C struct layout doesn't match the
tuple layout.  Columns potentially affected:

[local] test=*# select attrelid::regclass, attname from pg_attribute a join pg_class c on c.oid = attrelid where attalign = 'd' and relkind = 'r' and attnotnull and attlen <> -1;
    attrelid     │   attname    
─────────────────┼──────────────
 pg_sequence     │ seqstart
 pg_sequence     │ seqincrement
 pg_sequence     │ seqmax
 pg_sequence     │ seqmin
 pg_sequence     │ seqcache
 pg_subscription │ subskiplsn
(6 rows)

The pg_sequence fields evade trouble, because there's exactly eight bytes (two
oids) before them.
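
The mismatch can be sketched with a little offset arithmetic. The helpers below are hypothetical (they only mimic the round-up that both tuple forming and the C compiler perform), and the offset 81 used in the note afterwards is an assumed illustrative value for "an odd number of bytes precedes the column", not a figure stated in this thread.

```c
#include <stddef.h>

/* Round offset up to the next multiple of alignment (a power of two). */
static size_t align_up(size_t offset, size_t alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}

/*
 * Hypothetical helper: how far the C struct field lands past the stored
 * value when the tuple pads to tuple_align and the compiler pads to
 * struct_align.  Assumes struct_align >= tuple_align.  Zero means the two
 * layouts agree.
 */
static size_t layout_skew(size_t offset, size_t tuple_align, size_t struct_align)
{
    return align_up(offset, struct_align) - align_up(offset, tuple_align);
}
```

With 4-byte tuple alignment and 8-byte struct alignment, layout_skew(8, 4, 8) is 0, which is why a column preceded by exactly eight bytes is safe, while layout_skew(81, 4, 8) is 4: the struct field would sit four bytes past the stored value.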


Some options:
- Move subskiplsn after subdbid, so it's always aligned anyway.  I've
  confirmed that this lets the test pass, in 44s.
- Move subskiplsn to the CATALOG_VARLEN section, despite its fixed length.
- Introduce a new typalign value suitable for uint64.  This is more intrusive,
  but it's more future-proof.  Looking beyond catalog columns, it might
  improve performance by avoiding unaligned reads.

> Is it possible to run the test again with the attached patch?

Logs attached.  The test "passed", though it printed "poll_query_until timed
out" three times and took a while.

Attachments

log-subscription-20220401c.tar.xz

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

02 April 2022, 13:04:01

On Sat, Apr 2, 2022 at 1:43 PM Noah Misch <noah@leadboat.com> wrote:
>
> On Sat, Apr 02, 2022 at 04:33:44PM +0900, Masahiko Sawada wrote:
> > It seems that 0/B0706F72 is not a random value. Two subscriber logs
> > show the same value. Since 0x70 = 'p', 0x6F = 'o', and 0x72 = 'r', it
> > might show the next field in the pg_subscription catalog, i.e.,
> > subconninfo. The subscription is created by "CREATE SUBSCRIPTION sub
> > CONNECTION 'port=57851 host=/tmp/6u2vRwQYik dbname=postgres'
> > PUBLICATION pub WITH (disable_on_error = true, streaming = on,
> > two_phase = on)".
> >
> > Given that subscription.sql passes, something goes wrong when we read the
> > subskiplsn value with code like "sub->skiplsn = subform->subskiplsn;".
>
> That's a good clue.  We've never made pg_type.typalign able to represent
> alignment as it works on AIX.  A uint64 like pg_lsn has 8-byte alignment, so
> the C struct follows from that.  At the typalign level, we have only these:
>
> #define  TYPALIGN_CHAR                  'c' /* char alignment (i.e. unaligned) */
> #define  TYPALIGN_SHORT                 's' /* short alignment (typically 2 bytes) */
> #define  TYPALIGN_INT                   'i' /* int alignment (typically 4 bytes) */
> #define  TYPALIGN_DOUBLE                'd' /* double alignment (often 8 bytes) */
>
> On AIX, they are:
>
> #define ALIGNOF_DOUBLE 4
> #define ALIGNOF_INT 4
> #define ALIGNOF_LONG 8
> /* #undef ALIGNOF_LONG_LONG_INT */
> /* #undef ALIGNOF_PG_INT128_TYPE */
> #define ALIGNOF_SHORT 2
>
> uint64 and pg_lsn use TYPALIGN_DOUBLE.  For AIX, they really need a typalign
> corresponding to ALIGNOF_LONG.  Hence, the C struct layout doesn't match the
> tuple layout.  Columns potentially affected:
>
> [local] test=*# select attrelid::regclass, attname from pg_attribute a join pg_class c on c.oid = attrelid where attalign = 'd' and relkind = 'r' and attnotnull and attlen <> -1;
>     attrelid     │   attname
> ─────────────────┼──────────────
>  pg_sequence     │ seqstart
>  pg_sequence     │ seqincrement
>  pg_sequence     │ seqmax
>  pg_sequence     │ seqmin
>  pg_sequence     │ seqcache
>  pg_subscription │ subskiplsn
> (6 rows)
>
> The pg_sequence fields evade trouble, because there's exactly eight bytes (two
> oids) before them.
>
>
> Some options:
> - Move subskiplsn after subdbid, so it's always aligned anyway.  I've
>   confirmed that this lets the test pass, in 44s.
> - Move subskiplsn to the CATALOG_VARLEN section, despite its fixed length.
>

+1 to any one of the above. I mildly prefer the first option as that
will allow us to access the value directly instead of going via
SysCacheGetAttr but I am fine either way.

> - Introduce a new typalign value suitable for uint64.  This is more intrusive,
>   but it's more future-proof.  Looking beyond catalog columns, it might
>   improve performance by avoiding unaligned reads.
>
> > Is it possible to run the test again with the attached patch?
>
> Logs attached.  The test "passed", though it printed "poll_query_until timed
> out" three times and took a while.

Thanks for helping in figuring out the problem.

--
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

02 April 2022, 14:44:45

On Sat, Apr 2, 2022 at 7:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Apr 2, 2022 at 1:43 PM Noah Misch <noah@leadboat.com> wrote:
> >
> > On Sat, Apr 02, 2022 at 04:33:44PM +0900, Masahiko Sawada wrote:
> > > It seems that 0/B0706F72 is not a random value. Two subscriber logs
> > > show the same value. Since 0x70 = 'p', 0x6F = 'o', and 0x72 = 'r', it
> > > might show the next field in the pg_subscription catalog, i.e.,
> > > subconninfo. The subscription is created by "CREATE SUBSCRIPTION sub
> > > CONNECTION 'port=57851 host=/tmp/6u2vRwQYik dbname=postgres'
> > > PUBLICATION pub WITH (disable_on_error = true, streaming = on,
> > > two_phase = on)".
> > >
> > > Given that subscription.sql passes, something goes wrong when we read the
> > > subskiplsn value with code like "sub->skiplsn = subform->subskiplsn;".
> >
> > That's a good clue.  We've never made pg_type.typalign able to represent
> > alignment as it works on AIX.  A uint64 like pg_lsn has 8-byte alignment, so
> > the C struct follows from that.  At the typalign level, we have only these:
> >
> > #define  TYPALIGN_CHAR                  'c' /* char alignment (i.e. unaligned) */
> > #define  TYPALIGN_SHORT                 's' /* short alignment (typically 2 bytes) */
> > #define  TYPALIGN_INT                   'i' /* int alignment (typically 4 bytes) */
> > #define  TYPALIGN_DOUBLE                'd' /* double alignment (often 8 bytes) */
> >
> > On AIX, they are:
> >
> > #define ALIGNOF_DOUBLE 4
> > #define ALIGNOF_INT 4
> > #define ALIGNOF_LONG 8
> > /* #undef ALIGNOF_LONG_LONG_INT */
> > /* #undef ALIGNOF_PG_INT128_TYPE */
> > #define ALIGNOF_SHORT 2
> >
> > uint64 and pg_lsn use TYPALIGN_DOUBLE.  For AIX, they really need a typalign
> > corresponding to ALIGNOF_LONG.  Hence, the C struct layout doesn't match the
> > tuple layout.  Columns potentially affected:
> >
> > [local] test=*# select attrelid::regclass, attname from pg_attribute a join pg_class c on c.oid = attrelid where attalign = 'd' and relkind = 'r' and attnotnull and attlen <> -1;
> >     attrelid     │   attname
> > ─────────────────┼──────────────
> >  pg_sequence     │ seqstart
> >  pg_sequence     │ seqincrement
> >  pg_sequence     │ seqmax
> >  pg_sequence     │ seqmin
> >  pg_sequence     │ seqcache
> >  pg_subscription │ subskiplsn
> > (6 rows)
> >
> > The pg_sequence fields evade trouble, because there's exactly eight bytes (two
> > oids) before them.

Thanks for helping with the investigation!

> >
> >
> > Some options:
> > - Move subskiplsn after subdbid, so it's always aligned anyway.  I've
> >   confirmed that this lets the test pass, in 44s.
> > - Move subskiplsn to the CATALOG_VARLEN section, despite its fixed length.
> >
>
> +1 to any one of the above. I mildly prefer the first option as that
> will allow us to access the value directly instead of going via
> SysCacheGetAttr but I am fine either way.

+1. I also prefer the first option.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Noah Misch

Date:

03 April 2022, 03:45:55

On Sat, Apr 02, 2022 at 08:44:45PM +0900, Masahiko Sawada wrote:
> On Sat, Apr 2, 2022 at 7:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Sat, Apr 2, 2022 at 1:43 PM Noah Misch <noah@leadboat.com> wrote:
> > > Some options:
> > > - Move subskiplsn after subdbid, so it's always aligned anyway.  I've
> > >   confirmed that this lets the test pass, in 44s.
> > > - Move subskiplsn to the CATALOG_VARLEN section, despite its fixed length.
> >
> > +1 to any one of the above. I mildly prefer the first option as that
> > will allow us to access the value directly instead of going via
> > SysCacheGetAttr but I am fine either way.
> 
> +1. I also prefer the first option.

Sounds good to me.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

04 April 2022, 04:28:30

On Sun, Apr 3, 2022 at 9:45 AM Noah Misch <noah@leadboat.com> wrote:
>
> On Sat, Apr 02, 2022 at 08:44:45PM +0900, Masahiko Sawada wrote:
> > On Sat, Apr 2, 2022 at 7:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > On Sat, Apr 2, 2022 at 1:43 PM Noah Misch <noah@leadboat.com> wrote:
> > > > Some options:
> > > > - Move subskiplsn after subdbid, so it's always aligned anyway.  I've
> > > >   confirmed that this lets the test pass, in 44s.
> > > > - Move subskiplsn to the CATALOG_VARLEN section, despite its fixed length.
> > >
> > > +1 to any one of the above. I mildly prefer the first option as that
> > > will allow us to access the value directly instead of going via
> > > SysCacheGetAttr but I am fine either way.
> >
> > +1. I also prefer the first option.
>
> Sounds good to me.

I've attached the patch for the first option.

> - Introduce a new typalign value suitable for uint64.  This is more intrusive,
>   but it's more future-proof.  Looking beyond catalog columns, it might
>   improve performance by avoiding unaligned reads.

The third option would be a good item for PG16 or later.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

make_subskiplsn_aligned.patch

Re: Skipping logical replication transactions on subscriber side

From

Noah Misch

Date:

04 April 2022, 05:31:28

On Mon, Apr 04, 2022 at 10:28:30AM +0900, Masahiko Sawada wrote:
> On Sun, Apr 3, 2022 at 9:45 AM Noah Misch <noah@leadboat.com> wrote:
> > On Sat, Apr 02, 2022 at 08:44:45PM +0900, Masahiko Sawada wrote:
> > > On Sat, Apr 2, 2022 at 7:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > On Sat, Apr 2, 2022 at 1:43 PM Noah Misch <noah@leadboat.com> wrote:
> > > > > Some options:
> > > > > - Move subskiplsn after subdbid, so it's always aligned anyway.  I've
> > > > >   confirmed that this lets the test pass, in 44s.

> --- a/src/include/catalog/pg_subscription.h
> +++ b/src/include/catalog/pg_subscription.h
> @@ -54,6 +54,17 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
>  
>      Oid            subdbid BKI_LOOKUP(pg_database);    /* Database the
>                                                       * subscription is in. */
> +
> +    /*
> +     * All changes finished at this LSN are skipped.
> +     *
> +     * Note that XLogRecPtr, pg_lsn in the catalog, is 8-byte alignment
> +     * (TYPALIGN_DOUBLE) and it does not match the alignment on some platforms
> +     * such as AIX.  Therefore subskiplsn needs to be placed here so it is
> +     * always aligned.

I'm reading this comment as saying that TYPALIGN_DOUBLE is always 8 bytes, but
the problem arises precisely because TYPALIGN_DOUBLE==4 on AIX.

On most hosts, the C alignment of an XLogRecPtr is 8 bytes, and
TYPALIGN_DOUBLE==8.  On AIX, C alignment is still 8 bytes, but
TYPALIGN_DOUBLE==4.  The tuples on disk and in shared buffers use
TYPALIGN_DOUBLE to decide how much padding to insert, and that amount of
padding needs to match the C alignment padding.  Placing the field here
reduces the padding to zero, making that invariant hold trivially.
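
To make the padding mismatch concrete, here is a small simulation of the misread. It is a sketch under assumptions that are mine rather than stated in this thread: a big-endian host, a stored value that begins four bytes earlier than the struct expects, and hypothetical helper names. The four trailing bytes B0 70 6F 72 are the ones seen in the logs upthread.

```c
#include <stdint.h>
#include <string.h>

/* Write v in big-endian byte order, as a big-endian host stores it. */
static void put_be64(unsigned char *dst, uint64_t v)
{
    for (int i = 0; i < 8; i++)
        dst[i] = (unsigned char) (v >> (56 - 8 * i));
}

/* Read eight bytes in big-endian byte order. */
static uint64_t get_be64(const unsigned char *src)
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | src[i];
    return v;
}

/*
 * Hypothetical model: the tuple stores the LSN at offset 0, followed by the
 * first four bytes of the next column, but the reader starts four bytes late
 * because the C struct assumes more padding than the tuple actually has.
 */
static uint64_t misread_lsn(uint64_t stored_lsn, const unsigned char next_col[4])
{
    unsigned char tuple[12];

    put_be64(tuple, stored_lsn);
    memcpy(tuple + 8, next_col, 4);
    return get_be64(tuple + 4);
}
```

With next_col = {0xB0, 0x70, 0x6F, 0x72}, a stored 0/14EB7D8 is read back as 14EB7D8/B0706F72 and a stored 0/0 as 0/B0706F72, the two values reported in the logs.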

> +     */
> +    XLogRecPtr    subskiplsn;
> +
>      NameData    subname;        /* Name of the subscription */
>  
>      Oid            subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
> @@ -71,9 +82,6 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
>      bool        subdisableonerr;    /* True if a worker error should cause the
>                                       * subscription to be disabled */
>  
> -    XLogRecPtr    subskiplsn;        /* All changes finished at this LSN are
> -                                 * skipped */

Some code sites list pg_subscription fields in field order.  Please update
them so they continue to list fields in field order.  CreateSubscription() is
one example.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

04 April 2022, 05:50:08

On Mon, Apr 4, 2022 at 8:01 AM Noah Misch <noah@leadboat.com> wrote:
>
> On Mon, Apr 04, 2022 at 10:28:30AM +0900, Masahiko Sawada wrote:
> > On Sun, Apr 3, 2022 at 9:45 AM Noah Misch <noah@leadboat.com> wrote:
> > > On Sat, Apr 02, 2022 at 08:44:45PM +0900, Masahiko Sawada wrote:
> > > > On Sat, Apr 2, 2022 at 7:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > On Sat, Apr 2, 2022 at 1:43 PM Noah Misch <noah@leadboat.com> wrote:
> > > > > > Some options:
> > > > > > - Move subskiplsn after subdbid, so it's always aligned anyway.  I've
> > > > > >   confirmed that this lets the test pass, in 44s.
>
> > --- a/src/include/catalog/pg_subscription.h
> > +++ b/src/include/catalog/pg_subscription.h
> > @@ -54,6 +54,17 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
> >
> >       Oid                     subdbid BKI_LOOKUP(pg_database);        /* Database the
> >                                                                        * subscription is in. */
> >
> > +
> > +     /*
> > +      * All changes finished at this LSN are skipped.
> > +      *
> > +      * Note that XLogRecPtr, pg_lsn in the catalog, is 8-byte alignment
> > +      * (TYPALIGN_DOUBLE) and it does not match the alignment on some platforms
> > +      * such as AIX.  Therefore subskiplsn needs to be placed here so it is
> > +      * always aligned.
>
> I'm reading this comment as saying that TYPALIGN_DOUBLE is always 8 bytes, but
> the problem arises precisely because TYPALIGN_DOUBLE==4 on AIX.
>

How about a comment like: "It has to be kept at 8-byte alignment
boundary so as to be accessed directly via C struct as it uses
TYPALIGN_DOUBLE for storage which has 4-byte alignment on platforms
like AIX."? Can you please suggest a better comment if you don't like
this one?

> > +      */
> > +     XLogRecPtr      subskiplsn;
> > +
> >       NameData        subname;                /* Name of the subscription */
> >
> >       Oid                     subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
> > @@ -71,9 +82,6 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
> >       bool            subdisableonerr;        /* True if a worker error should cause the
> >                                                                        * subscription to be disabled */
> >
> > -     XLogRecPtr      subskiplsn;             /* All changes finished at this LSN are
> > -                                                              * skipped */
>
> Some code sites list pg_subscription fields in field order.  Please update
> them so they continue to list fields in field order.  CreateSubscription() is
> one example.
>

Another minor point is that I think it is better to use DatumGetLSN to
read this in GetSubscription as we use LSNGetDatum while storing it. I
am not sure if there is any direct problem due to this but that looks
consistent to me.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

4 April 2022, 06:10:28

On Mon, Apr 4, 2022 at 11:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Apr 4, 2022 at 8:01 AM Noah Misch <noah@leadboat.com> wrote:
> >
> > On Mon, Apr 04, 2022 at 10:28:30AM +0900, Masahiko Sawada wrote:
> > > On Sun, Apr 3, 2022 at 9:45 AM Noah Misch <noah@leadboat.com> wrote:
> > > > On Sat, Apr 02, 2022 at 08:44:45PM +0900, Masahiko Sawada wrote:
> > > > > On Sat, Apr 2, 2022 at 7:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > On Sat, Apr 2, 2022 at 1:43 PM Noah Misch <noah@leadboat.com> wrote:
> > > > > > > Some options:
> > > > > > > - Move subskiplsn after subdbid, so it's always aligned anyway.  I've
> > > > > > >   confirmed that this lets the test pass, in 44s.
> >
> > > --- a/src/include/catalog/pg_subscription.h
> > > +++ b/src/include/catalog/pg_subscription.h
> > > @@ -54,6 +54,17 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
> > >
> > >       Oid                     subdbid BKI_LOOKUP(pg_database);        /* Database the
> > >                                                                        * subscription is in. */
> > >
> > > +
> > > +     /*
> > > +      * All changes finished at this LSN are skipped.
> > > +      *
> > > +      * Note that XLogRecPtr, pg_lsn in the catalog, is 8-byte alignment
> > > +      * (TYPALIGN_DOUBLE) and it does not match the alignment on some platforms
> > > +      * such as AIX.  Therefore subskiplsn needs to be placed here so it is
> > > +      * always aligned.
> >
> > I'm reading this comment as saying that TYPALIGN_DOUBLE is always 8 bytes, but
> > the problem arises precisely because TYPALIGN_DOUBLE==4 on AIX.
> >
>
> How about a comment like: "It has to be kept at 8-byte alignment
> boundary so as to be accessed directly via C struct as it uses
> TYPALIGN_DOUBLE for storage which has 4-byte alignment on platforms
> like AIX."? Can you please suggest a better comment if you don't like
> this one?
>
> > > +      */
> > > +     XLogRecPtr      subskiplsn;
> > > +
> > >       NameData        subname;                /* Name of the subscription */
> > >
> > >       Oid                     subowner BKI_LOOKUP(pg_authid); /* Owner of the subscription */
> > > @@ -71,9 +82,6 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
> > >       bool            subdisableonerr;        /* True if a worker error should cause the
> > >                                                                        * subscription to be disabled */
> > >
> > > -     XLogRecPtr      subskiplsn;             /* All changes finished at this LSN are
> > > -                                                              * skipped */
> >
> > Some code sites list pg_subscription fields in field order.  Please update
> > them so they continue to list fields in field order.  CreateSubscription() is
> > one example.
> >
>
> Another minor point is that I think it is better to use DatumGetLSN to
> read this in GetSubscription as we use LSNGetDatum while storing it. I
> am not sure if there is any direct problem due to this but that looks
> consistent to me.

But that seems inconsistent with other usages, since we don't normally
use DatumGetXXX to read values directly from a C struct.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

4 April 2022, 06:32:37

On Mon, Apr 4, 2022 at 8:41 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Apr 4, 2022 at 11:50 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > Another minor point is that I think it is better to use DatumGetLSN to
> > read this in GetSubscription as we use LSNGetDatum while storing it. I
> > am not sure if there is any direct problem due to this but that looks
> > consistent to me.
>
> But that seems inconsistent with other usages, since we don't normally
> use DatumGetXXX to read values directly from a C struct.
>

Okay, I see that we don't use it for sequences either, so we can
probably leave it as it is.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Noah Misch

Date:

4 April 2022, 09:26:20

On Mon, Apr 04, 2022 at 08:20:08AM +0530, Amit Kapila wrote:
> On Mon, Apr 4, 2022 at 8:01 AM Noah Misch <noah@leadboat.com> wrote:
> > On Mon, Apr 04, 2022 at 10:28:30AM +0900, Masahiko Sawada wrote:
> > > On Sun, Apr 3, 2022 at 9:45 AM Noah Misch <noah@leadboat.com> wrote:
> > > > On Sat, Apr 02, 2022 at 08:44:45PM +0900, Masahiko Sawada wrote:
> > > > > On Sat, Apr 2, 2022 at 7:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > On Sat, Apr 2, 2022 at 1:43 PM Noah Misch <noah@leadboat.com> wrote:
> > > > > > > Some options:
> > > > > > > - Move subskiplsn after subdbid, so it's always aligned anyway.  I've
> > > > > > >   confirmed that this lets the test pass, in 44s.
> >
> > > --- a/src/include/catalog/pg_subscription.h
> > > +++ b/src/include/catalog/pg_subscription.h
> > > @@ -54,6 +54,17 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
> > >
> > >       Oid                     subdbid BKI_LOOKUP(pg_database);        /* Database the
> > >                                                                        * subscription is in. */
> > >
> > > +
> > > +     /*
> > > +      * All changes finished at this LSN are skipped.
> > > +      *
> > > +      * Note that XLogRecPtr, pg_lsn in the catalog, is 8-byte alignment
> > > +      * (TYPALIGN_DOUBLE) and it does not match the alignment on some platforms
> > > +      * such as AIX.  Therefore subskiplsn needs to be placed here so it is
> > > +      * always aligned.
> >
> > I'm reading this comment as saying that TYPALIGN_DOUBLE is always 8 bytes, but
> > the problem arises precisely because TYPALIGN_DOUBLE==4 on AIX.
> 
> How about a comment like: "It has to be kept at 8-byte alignment
> boundary so as to be accessed directly via C struct as it uses
> TYPALIGN_DOUBLE for storage which has 4-byte alignment on platforms
> like AIX."? Can you please suggest a better comment if you don't like
> this one?

I'd write it like this, though I'm not sure it's an improvement on your words:

  When ALIGNOF_DOUBLE==4 (e.g. AIX), the C ABI may impose 8-byte alignment on
  some of the C types that correspond to TYPALIGN_DOUBLE SQL types.  To ensure
  catalog C struct layout matches catalog tuple layout, arrange for the tuple
  offset of each fixed-width, attalign='d' catalog column to be divisible by 8
  unconditionally.  Keep such columns before the first NameData column of the
  catalog, since packagers can override NAMEDATALEN to an odd number.

The best place for such a comment would be in one of
src/test/regress/sql/*sanity*.sql, next to a test written to detect new
violations.  If adding such a test would materially delay getting the
buildfarm green, putting the comment in pg_subscription.h works for me.
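
The mismatch that the proposed comment describes can be sketched in C. This is only an illustration under stated assumptions, not code from the patch: the MockSubOld and MockSubNew structs are hypothetical stand-ins for the catalog struct before and after the move, MOCK_NAMEDATALEN assumes the default name length of 64, and the fixed 4-byte and 1-byte widths for oid and bool/char columns are assumed. The tuple offset is padded to ALIGNOF_DOUBLE, while the C struct offset is chosen by the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_NAMEDATALEN 64		/* assumed default build value */

/* Hypothetical mirror of the field order before the patch. */
typedef struct MockSubOld
{
	uint32_t	oid;
	uint32_t	subdbid;
	char		subname[MOCK_NAMEDATALEN];
	uint32_t	subowner;
	char		flags[5];		/* the five 1-byte bool/char columns */
	uint64_t	subskiplsn;
} MockSubOld;

/* Hypothetical mirror of the field order after the patch. */
typedef struct MockSubNew
{
	uint32_t	oid;
	uint32_t	subdbid;
	uint64_t	subskiplsn;
	char		subname[MOCK_NAMEDATALEN];
	/* remaining fields omitted for brevity */
} MockSubNew;

/* Round off up to the next multiple of align (a power of two). */
size_t
align_up(size_t off, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (off + align - 1) & ~(align - 1);
}

/*
 * Tuple offset of subskiplsn in the old order.  The preceding columns
 * occupy 4 + 4 + 64 + 4 + 5 = 81 bytes with no internal padding; the
 * attalign='d' column is then padded to ALIGNOF_DOUBLE.
 */
size_t
tuple_offset_old(size_t alignof_double)
{
	size_t		unpadded = 4 + 4 + MOCK_NAMEDATALEN + 4 + 5;

	return align_up(unpadded, alignof_double);
}

/* Tuple offset of subskiplsn in the new order: right after two oids. */
size_t
tuple_offset_new(size_t alignof_double)
{
	return align_up(4 + 4, alignof_double);
}
```

With ALIGNOF_DOUBLE==4 the old order yields a tuple offset of 84, whereas a compiler that aligns uint64_t to 8 bytes places the struct field at offsetof(MockSubOld, subskiplsn) == 88. In the new order both computations give 8, so the two layouts agree regardless of ALIGNOF_DOUBLE.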

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

4 April 2022, 12:55:45

On Mon, Apr 4, 2022 at 3:26 PM Noah Misch <noah@leadboat.com> wrote:
>
> On Mon, Apr 04, 2022 at 08:20:08AM +0530, Amit Kapila wrote:
> > On Mon, Apr 4, 2022 at 8:01 AM Noah Misch <noah@leadboat.com> wrote:
> > > On Mon, Apr 04, 2022 at 10:28:30AM +0900, Masahiko Sawada wrote:
> > > > On Sun, Apr 3, 2022 at 9:45 AM Noah Misch <noah@leadboat.com> wrote:
> > > > > On Sat, Apr 02, 2022 at 08:44:45PM +0900, Masahiko Sawada wrote:
> > > > > > On Sat, Apr 2, 2022 at 7:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > > On Sat, Apr 2, 2022 at 1:43 PM Noah Misch <noah@leadboat.com> wrote:
> > > > > > > > Some options:
> > > > > > > > - Move subskiplsn after subdbid, so it's always aligned anyway.  I've
> > > > > > > >   confirmed that this lets the test pass, in 44s.
> > >
> > > > --- a/src/include/catalog/pg_subscription.h
> > > > +++ b/src/include/catalog/pg_subscription.h
> > > > @@ -54,6 +54,17 @@ CATALOG(pg_subscription,6100,SubscriptionRelationId) BKI_SHARED_RELATION BKI_ROW
> > > >
> > > >       Oid                     subdbid BKI_LOOKUP(pg_database);        /* Database the
> > > >                                                                        * subscription is in. */
> > > >
> > > > +
> > > > +     /*
> > > > +      * All changes finished at this LSN are skipped.
> > > > +      *
> > > > +      * Note that XLogRecPtr, pg_lsn in the catalog, is 8-byte alignment
> > > > +      * (TYPALIGN_DOUBLE) and it does not match the alignment on some platforms
> > > > +      * such as AIX.  Therefore subskiplsn needs to be placed here so it is
> > > > +      * always aligned.
> > >
> > > I'm reading this comment as saying that TYPALIGN_DOUBLE is always 8 bytes, but
> > > the problem arises precisely because TYPALIGN_DOUBLE==4 on AIX.
> >
> > How about a comment like: "It has to be kept at 8-byte alignment
> > boundary so as to be accessed directly via C struct as it uses
> > TYPALIGN_DOUBLE for storage which has 4-byte alignment on platforms
> > like AIX."? Can you please suggest a better comment if you don't like
> > this one?
>
> I'd write it like this, though I'm not sure it's an improvement on your words:
>
>   When ALIGNOF_DOUBLE==4 (e.g. AIX), the C ABI may impose 8-byte alignment on
>   some of the C types that correspond to TYPALIGN_DOUBLE SQL types.  To ensure
>   catalog C struct layout matches catalog tuple layout, arrange for the tuple
>   offset of each fixed-width, attalign='d' catalog column to be divisible by 8
>   unconditionally.  Keep such columns before the first NameData column of the
>   catalog, since packagers can override NAMEDATALEN to an odd number.

Thanks!

>
> The best place for such a comment would be in one of
> src/test/regress/sql/*sanity*.sql, next to a test written to detect new
> violations.

Agreed.

IIUC in the new test, we would need a new SQL function to calculate
the offset of catalog columns including padding, is that right? Or do
you have an idea to do that by using existing functionality?

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Noah Misch

Date:

5 April 2022, 03:21:07

On Mon, Apr 04, 2022 at 06:55:45PM +0900, Masahiko Sawada wrote:
> On Mon, Apr 4, 2022 at 3:26 PM Noah Misch <noah@leadboat.com> wrote:
> > On Mon, Apr 04, 2022 at 08:20:08AM +0530, Amit Kapila wrote:
> > > How about a comment like: "It has to be kept at 8-byte alignment
> > > boundary so as to be accessed directly via C struct as it uses
> > > TYPALIGN_DOUBLE for storage which has 4-byte alignment on platforms
> > > like AIX."? Can you please suggest a better comment if you don't like
> > > this one?
> >
> > I'd write it like this, though I'm not sure it's an improvement on your words:
> >
> >   When ALIGNOF_DOUBLE==4 (e.g. AIX), the C ABI may impose 8-byte alignment on
> >   some of the C types that correspond to TYPALIGN_DOUBLE SQL types.  To ensure
> >   catalog C struct layout matches catalog tuple layout, arrange for the tuple
> >   offset of each fixed-width, attalign='d' catalog column to be divisible by 8
> >   unconditionally.  Keep such columns before the first NameData column of the
> >   catalog, since packagers can override NAMEDATALEN to an odd number.
> 
> Thanks!
> 
> >
> > The best place for such a comment would be in one of
> > src/test/regress/sql/*sanity*.sql, next to a test written to detect new
> > violations.
> 
> Agreed.
> 
> IIUC in the new test, we would need a new SQL function to calculate
> the offset of catalog columns including padding, is that right? Or do
> you have an idea to do that by using existing functionality?

Something like this:

select
  attrelid::regclass,
  attname,
  array(select typname
        from pg_type t join pg_attribute pa on t.oid = pa.atttypid
        where pa.attrelid = a.attrelid and pa.attnum > 0 and pa.attnum < a.attnum order by pa.attnum) AS types_before,
  (select sum(attlen)
   from pg_type t join pg_attribute pa on t.oid = pa.atttypid
   where pa.attrelid = a.attrelid and pa.attnum > 0 and pa.attnum < a.attnum) AS len_before
from pg_attribute a
join pg_class c on c.oid = attrelid
where attalign = 'd' and relkind = 'r' and attnotnull and attlen <> -1
order by attrelid::regclass::text, attnum;
    attrelid     │   attname    │                types_before                 │ len_before
─────────────────┼──────────────┼─────────────────────────────────────────────┼────────────
 pg_sequence     │ seqstart     │ {oid,oid}                                   │          8
 pg_sequence     │ seqincrement │ {oid,oid,int8}                              │         16
 pg_sequence     │ seqmax       │ {oid,oid,int8,int8}                         │         24
 pg_sequence     │ seqmin       │ {oid,oid,int8,int8,int8}                    │         32
 pg_sequence     │ seqcache     │ {oid,oid,int8,int8,int8,int8}               │         40
 pg_subscription │ subskiplsn   │ {oid,oid,name,oid,bool,bool,bool,char,bool} │         81
(6 rows)

That doesn't count padding, but hazardous column changes will cause a diff in
the output.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

5 April 2022, 04:13:06

On Tue, Apr 5, 2022 at 9:21 AM Noah Misch <noah@leadboat.com> wrote:
>
> On Mon, Apr 04, 2022 at 06:55:45PM +0900, Masahiko Sawada wrote:
> > On Mon, Apr 4, 2022 at 3:26 PM Noah Misch <noah@leadboat.com> wrote:
> > > On Mon, Apr 04, 2022 at 08:20:08AM +0530, Amit Kapila wrote:
> > > > How about a comment like: "It has to be kept at 8-byte alignment
> > > > boundary so as to be accessed directly via C struct as it uses
> > > > TYPALIGN_DOUBLE for storage which has 4-byte alignment on platforms
> > > > like AIX."? Can you please suggest a better comment if you don't like
> > > > this one?
> > >
> > > I'd write it like this, though I'm not sure it's an improvement on your words:
> > >
> > >   When ALIGNOF_DOUBLE==4 (e.g. AIX), the C ABI may impose 8-byte alignment on
> > >   some of the C types that correspond to TYPALIGN_DOUBLE SQL types.  To ensure
> > >   catalog C struct layout matches catalog tuple layout, arrange for the tuple
> > >   offset of each fixed-width, attalign='d' catalog column to be divisible by 8
> > >   unconditionally.  Keep such columns before the first NameData column of the
> > >   catalog, since packagers can override NAMEDATALEN to an odd number.
> >
> > Thanks!
> >
> > >
> > > The best place for such a comment would be in one of
> > > src/test/regress/sql/*sanity*.sql, next to a test written to detect new
> > > violations.
> >
> > Agreed.
> >
> > IIUC in the new test, we would need a new SQL function to calculate
> > the offset of catalog columns including padding, is that right? Or do
> > you have an idea to do that by using existing functionality?
>
> Something like this:
>
> select
>   attrelid::regclass,
>   attname,
>   array(select typname
>         from pg_type t join pg_attribute pa on t.oid = pa.atttypid
>         where pa.attrelid = a.attrelid and pa.attnum > 0 and pa.attnum < a.attnum order by pa.attnum) AS types_before,
>   (select sum(attlen)
>    from pg_type t join pg_attribute pa on t.oid = pa.atttypid
>    where pa.attrelid = a.attrelid and pa.attnum > 0 and pa.attnum < a.attnum) AS len_before
> from pg_attribute a
> join pg_class c on c.oid = attrelid
> where attalign = 'd' and relkind = 'r' and attnotnull and attlen <> -1
> order by attrelid::regclass::text, attnum;
>     attrelid     │   attname    │                types_before                 │ len_before
> ─────────────────┼──────────────┼─────────────────────────────────────────────┼────────────
>  pg_sequence     │ seqstart     │ {oid,oid}                                   │          8
>  pg_sequence     │ seqincrement │ {oid,oid,int8}                              │         16
>  pg_sequence     │ seqmax       │ {oid,oid,int8,int8}                         │         24
>  pg_sequence     │ seqmin       │ {oid,oid,int8,int8,int8}                    │         32
>  pg_sequence     │ seqcache     │ {oid,oid,int8,int8,int8,int8}               │         40
>  pg_subscription │ subskiplsn   │ {oid,oid,name,oid,bool,bool,bool,char,bool} │         81
> (6 rows)
>
> That doesn't count padding, but hazardous column changes will cause a diff in
> the output.

Yes, in this case, we can detect the violated column order even
without considering padding. On the other hand, I think this
calculation could not detect some patterns of order. For instance,
suppose the column order is {oid, bool, bool, oid, bool, bool, oid,
int8}, the len_before is 16 but offset of int8 column including
padding is 20 on ALIGNOF_DOUBLE==4 environment.
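
The arithmetic in this example can be checked with a short C sketch. It is a hedged illustration, not the function that any patch adds: MockColumn, len_before(), and padded_offset() are hypothetical names, and the alignments assumed for attalign 'c', 's', and 'i' are 1, 2, and 4 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one fixed-width catalog column. */
typedef struct MockColumn
{
	size_t		len;			/* attlen */
	char		attalign;		/* 'c', 's', 'i' or 'd' */
} MockColumn;

/* The column order from the example: {oid,bool,bool,oid,bool,bool,oid,int8} */
const MockColumn example_cols[] = {
	{4, 'i'}, {1, 'c'}, {1, 'c'}, {4, 'i'},
	{1, 'c'}, {1, 'c'}, {4, 'i'}, {8, 'd'}
};
const size_t example_ncols = sizeof(example_cols) / sizeof(example_cols[0]);

/* Map attalign to bytes; 'd' depends on the platform's ALIGNOF_DOUBLE. */
static size_t
align_of(char attalign, size_t alignof_double)
{
	switch (attalign)
	{
		case 'c':
			return 1;
		case 's':
			return 2;
		case 'i':
			return 4;
		case 'd':
			return alignof_double;
		default:
			assert(!"unknown attalign");
			return 1;
	}
}

/* Sum of attlen over all columns before the last one, ignoring padding. */
size_t
len_before(const MockColumn *cols, size_t ncols)
{
	size_t		sum = 0;

	for (size_t i = 0; i + 1 < ncols; i++)
		sum += cols[i].len;
	return sum;
}

/* Offset of the last column once each column is padded to its alignment. */
size_t
padded_offset(const MockColumn *cols, size_t ncols, size_t alignof_double)
{
	size_t		off = 0;

	for (size_t i = 0; i < ncols; i++)
	{
		size_t		a = align_of(cols[i].attalign, alignof_double);

		off = (off + a - 1) / a * a;	/* pad up to this column's alignment */
		if (i + 1 == ncols)
			return off;
		off += cols[i].len;
	}
	return off;
}
```

For example_cols, len_before() returns 16, which is divisible by 8 and so looks safe, while padded_offset() returns 20 with alignof_double set to 4 and 24 with it set to 8. A check based only on the unpadded sum would therefore miss this ordering.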

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Noah Misch

Date:

5 April 2022, 04:46:20

On Tue, Apr 05, 2022 at 10:13:06AM +0900, Masahiko Sawada wrote:
> On Tue, Apr 5, 2022 at 9:21 AM Noah Misch <noah@leadboat.com> wrote:
> > On Mon, Apr 04, 2022 at 06:55:45PM +0900, Masahiko Sawada wrote:
> > > On Mon, Apr 4, 2022 at 3:26 PM Noah Misch <noah@leadboat.com> wrote:
> > > > On Mon, Apr 04, 2022 at 08:20:08AM +0530, Amit Kapila wrote:
> > > > > How about a comment like: "It has to be kept at 8-byte alignment
> > > > > boundary so as to be accessed directly via C struct as it uses
> > > > > TYPALIGN_DOUBLE for storage which has 4-byte alignment on platforms
> > > > > like AIX."? Can you please suggest a better comment if you don't like
> > > > > this one?
> > > >
> > > > I'd write it like this, though I'm not sure it's an improvement on your words:
> > > >
> > > >   When ALIGNOF_DOUBLE==4 (e.g. AIX), the C ABI may impose 8-byte alignment on
> > > >   some of the C types that correspond to TYPALIGN_DOUBLE SQL types.  To ensure
> > > >   catalog C struct layout matches catalog tuple layout, arrange for the tuple
> > > >   offset of each fixed-width, attalign='d' catalog column to be divisible by 8
> > > >   unconditionally.  Keep such columns before the first NameData column of the
> > > >   catalog, since packagers can override NAMEDATALEN to an odd number.
> > >
> > > Thanks!
> > >
> > > >
> > > > The best place for such a comment would be in one of
> > > > src/test/regress/sql/*sanity*.sql, next to a test written to detect new
> > > > violations.
> > >
> > > Agreed.
> > >
> > > IIUC in the new test, we would need a new SQL function to calculate
> > > the offset of catalog columns including padding, is that right? Or do
> > > you have an idea to do that by using existing functionality?
> >
> > Something like this:
> >
> > select
> >   attrelid::regclass,
> >   attname,
> >   array(select typname
> >         from pg_type t join pg_attribute pa on t.oid = pa.atttypid
> >         where pa.attrelid = a.attrelid and pa.attnum > 0 and pa.attnum < a.attnum order by pa.attnum) AS types_before,
> >   (select sum(attlen)
> >    from pg_type t join pg_attribute pa on t.oid = pa.atttypid
> >    where pa.attrelid = a.attrelid and pa.attnum > 0 and pa.attnum < a.attnum) AS len_before
> > from pg_attribute a
> > join pg_class c on c.oid = attrelid
> > where attalign = 'd' and relkind = 'r' and attnotnull and attlen <> -1
> > order by attrelid::regclass::text, attnum;
> >     attrelid     │   attname    │                types_before                 │ len_before
> > ─────────────────┼──────────────┼─────────────────────────────────────────────┼────────────
> >  pg_sequence     │ seqstart     │ {oid,oid}                                   │          8
> >  pg_sequence     │ seqincrement │ {oid,oid,int8}                              │         16
> >  pg_sequence     │ seqmax       │ {oid,oid,int8,int8}                         │         24
> >  pg_sequence     │ seqmin       │ {oid,oid,int8,int8,int8}                    │         32
> >  pg_sequence     │ seqcache     │ {oid,oid,int8,int8,int8,int8}               │         40
> >  pg_subscription │ subskiplsn   │ {oid,oid,name,oid,bool,bool,bool,char,bool} │         81
> > (6 rows)
> >
> > That doesn't count padding, but hazardous column changes will cause a diff in
> > the output.
> 
> Yes, in this case, we can detect the violated column order even
> without considering padding. On the other hand, I think this
> calculation could not detect some patterns of order. For instance,
> suppose the column order is {oid, bool, bool, oid, bool, bool, oid,
> int8}, the len_before is 16 but offset of int8 column including
> padding is 20 on ALIGNOF_DOUBLE==4 environment.

Correct.  Feel free to make it more precise.  If you do want to add a
function, it could be a regress.c function rather than an always-installed
part of PostgreSQL.  Again, getting the buildfarm green is a priority; we can
always add tests later.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

5 April 2022, 06:38:49

On Tue, Apr 5, 2022 at 10:46 AM Noah Misch <noah@leadboat.com> wrote:
>
> On Tue, Apr 05, 2022 at 10:13:06AM +0900, Masahiko Sawada wrote:
> > On Tue, Apr 5, 2022 at 9:21 AM Noah Misch <noah@leadboat.com> wrote:
> > > On Mon, Apr 04, 2022 at 06:55:45PM +0900, Masahiko Sawada wrote:
> > > > On Mon, Apr 4, 2022 at 3:26 PM Noah Misch <noah@leadboat.com> wrote:
> > > > > On Mon, Apr 04, 2022 at 08:20:08AM +0530, Amit Kapila wrote:
> > > > > > How about a comment like: "It has to be kept at 8-byte alignment
> > > > > > boundary so as to be accessed directly via C struct as it uses
> > > > > > TYPALIGN_DOUBLE for storage which has 4-byte alignment on platforms
> > > > > > like AIX."? Can you please suggest a better comment if you don't like
> > > > > > this one?
> > > > >
> > > > > I'd write it like this, though I'm not sure it's an improvement on your words:
> > > > >
> > > > >   When ALIGNOF_DOUBLE==4 (e.g. AIX), the C ABI may impose 8-byte alignment on
> > > > >   some of the C types that correspond to TYPALIGN_DOUBLE SQL types.  To ensure
> > > > >   catalog C struct layout matches catalog tuple layout, arrange for the tuple
> > > > >   offset of each fixed-width, attalign='d' catalog column to be divisible by 8
> > > > >   unconditionally.  Keep such columns before the first NameData column of the
> > > > >   catalog, since packagers can override NAMEDATALEN to an odd number.
> > > >
> > > > Thanks!
> > > >
> > > > >
> > > > > The best place for such a comment would be in one of
> > > > > src/test/regress/sql/*sanity*.sql, next to a test written to detect new
> > > > > violations.
> > > >
> > > > Agreed.
> > > >
> > > > IIUC in the new test, we would need a new SQL function to calculate
> > > > the offset of catalog columns including padding, is that right? Or do
> > > > you have an idea to do that by using existing functionality?
> > >
> > > Something like this:
> > >
> > > select
> > >   attrelid::regclass,
> > >   attname,
> > >   array(select typname
> > >         from pg_type t join pg_attribute pa on t.oid = pa.atttypid
> > >         where pa.attrelid = a.attrelid and pa.attnum > 0 and pa.attnum < a.attnum order by pa.attnum) AS types_before,
> > >   (select sum(attlen)
> > >    from pg_type t join pg_attribute pa on t.oid = pa.atttypid
> > >    where pa.attrelid = a.attrelid and pa.attnum > 0 and pa.attnum < a.attnum) AS len_before
> > > from pg_attribute a
> > > join pg_class c on c.oid = attrelid
> > > where attalign = 'd' and relkind = 'r' and attnotnull and attlen <> -1
> > > order by attrelid::regclass::text, attnum;
> > >     attrelid     │   attname    │                types_before                 │ len_before
> > > ─────────────────┼──────────────┼─────────────────────────────────────────────┼────────────
> > >  pg_sequence     │ seqstart     │ {oid,oid}                                   │          8
> > >  pg_sequence     │ seqincrement │ {oid,oid,int8}                              │         16
> > >  pg_sequence     │ seqmax       │ {oid,oid,int8,int8}                         │         24
> > >  pg_sequence     │ seqmin       │ {oid,oid,int8,int8,int8}                    │         32
> > >  pg_sequence     │ seqcache     │ {oid,oid,int8,int8,int8,int8}               │         40
> > >  pg_subscription │ subskiplsn   │ {oid,oid,name,oid,bool,bool,bool,char,bool} │         81
> > > (6 rows)
> > >
> > > That doesn't count padding, but hazardous column changes will cause a diff in
> > > the output.
> >
> > Yes, in this case, we can detect the violated column order even
> > without considering padding. On the other hand, I think this
> > calculation could not detect some patterns of order. For instance,
> > suppose the column order is {oid, bool, bool, oid, bool, bool, oid,
> > int8}, the len_before is 16 but offset of int8 column including
> > padding is 20 on ALIGNOF_DOUBLE==4 environment.
>
> Correct.  Feel free to make it more precise.  If you do want to add a
> function, it could be a regress.c function rather than an always-installed
> part of PostgreSQL.  Again, getting the buildfarm green is a priority; we can
> always add tests later.

Agreed. I'll update and submit the patch as soon as possible.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

5 April 2022, 09:05:10

On Tue, Apr 5, 2022 at 12:38 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Apr 5, 2022 at 10:46 AM Noah Misch <noah@leadboat.com> wrote:
> >
> > On Tue, Apr 05, 2022 at 10:13:06AM +0900, Masahiko Sawada wrote:
> > > On Tue, Apr 5, 2022 at 9:21 AM Noah Misch <noah@leadboat.com> wrote:
> > > > On Mon, Apr 04, 2022 at 06:55:45PM +0900, Masahiko Sawada wrote:
> > > > > On Mon, Apr 4, 2022 at 3:26 PM Noah Misch <noah@leadboat.com> wrote:
> > > > > > On Mon, Apr 04, 2022 at 08:20:08AM +0530, Amit Kapila wrote:
> > > > > > > How about a comment like: "It has to be kept at 8-byte alignment
> > > > > > > boundary so as to be accessed directly via C struct as it uses
> > > > > > > TYPALIGN_DOUBLE for storage which has 4-byte alignment on platforms
> > > > > > > like AIX."? Can you please suggest a better comment if you don't like
> > > > > > > this one?
> > > > > >
> > > > > > I'd write it like this, though I'm not sure it's an improvement on your words:
> > > > > >
> > > > > >   When ALIGNOF_DOUBLE==4 (e.g. AIX), the C ABI may impose 8-byte alignment on
> > > > > >   some of the C types that correspond to TYPALIGN_DOUBLE SQL types.  To ensure
> > > > > >   catalog C struct layout matches catalog tuple layout, arrange for the tuple
> > > > > >   offset of each fixed-width, attalign='d' catalog column to be divisible by 8
> > > > > >   unconditionally.  Keep such columns before the first NameData column of the
> > > > > >   catalog, since packagers can override NAMEDATALEN to an odd number.
> > > > >
> > > > > Thanks!
> > > > >
> > > > > >
> > > > > > The best place for such a comment would be in one of
> > > > > > src/test/regress/sql/*sanity*.sql, next to a test written to detect new
> > > > > > violations.
> > > > >
> > > > > Agreed.
> > > > >
> > > > > IIUC in the new test, we would need a new SQL function to calculate
> > > > > the offset of catalog columns including padding, is that right? Or do
> > > > > you have an idea to do that by using existing functionality?
> > > >
> > > > Something like this:
> > > >
> > > > select
> > > >   attrelid::regclass,
> > > >   attname,
> > > >   array(select typname
> > > >         from pg_type t join pg_attribute pa on t.oid = pa.atttypid
> > > >         where pa.attrelid = a.attrelid and pa.attnum > 0 and pa.attnum < a.attnum order by pa.attnum) AS types_before,
> > > >   (select sum(attlen)
> > > >    from pg_type t join pg_attribute pa on t.oid = pa.atttypid
> > > >    where pa.attrelid = a.attrelid and pa.attnum > 0 and pa.attnum < a.attnum) AS len_before
> > > > from pg_attribute a
> > > > join pg_class c on c.oid = attrelid
> > > > where attalign = 'd' and relkind = 'r' and attnotnull and attlen <> -1
> > > > order by attrelid::regclass::text, attnum;
> > > >     attrelid     │   attname    │                types_before                 │ len_before
> > > > ─────────────────┼──────────────┼─────────────────────────────────────────────┼────────────
> > > >  pg_sequence     │ seqstart     │ {oid,oid}                                   │          8
> > > >  pg_sequence     │ seqincrement │ {oid,oid,int8}                              │         16
> > > >  pg_sequence     │ seqmax       │ {oid,oid,int8,int8}                         │         24
> > > >  pg_sequence     │ seqmin       │ {oid,oid,int8,int8,int8}                    │         32
> > > >  pg_sequence     │ seqcache     │ {oid,oid,int8,int8,int8,int8}               │         40
> > > >  pg_subscription │ subskiplsn   │ {oid,oid,name,oid,bool,bool,bool,char,bool} │         81
> > > > (6 rows)
> > > >
> > > > That doesn't count padding, but hazardous column changes will cause a diff in
> > > > the output.
> > >
> > > Yes, in this case, we can detect the violated column order even
> > > without considering padding. On the other hand, I think this
> > > calculation could not detect some patterns of order. For instance,
> > > suppose the column order is {oid, bool, bool, oid, bool, bool, oid,
> > > int8}, the len_before is 16 but offset of int8 column including
> > > padding is 20 on ALIGNOF_DOUBLE==4 environment.
> >
> > Correct.  Feel free to make it more precise.  If you do want to add a
> > function, it could be a regress.c function rather than an always-installed
> > part of PostgreSQL.  Again, getting the buildfarm green is a priority; we can
> > always add tests later.
>
> Agreed. I'll update and submit the patch as soon as possible.
>

I've attached an updated patch. The patch includes a regression test
to detect the new violation as we discussed. I've confirmed that
Cirrus CI tests pass. Please confirm on AIX and review the patch.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

make_subskiplsn_aligned_v2.patch

Re: Skipping logical replication transactions on subscriber side

From

Noah Misch

Date:

April 5, 2022, 10:08:16

On Tue, Apr 05, 2022 at 03:05:10PM +0900, Masahiko Sawada wrote:
> I've attached an updated patch. The patch includes a regression test
> to detect the new violation as we discussed. I've confirmed that
> Cirrus CI tests pass. Please confirm on AIX and review the patch.

When the context of a "git grep skiplsn" match involves several struct fields
in struct order, please change to the new order.  In other words, do for all
"git grep skiplsn" matches what the v2 patch does in GetSubscription().  The
v2 patch does not do this for catalogs.sgml, but it ought to.  I didn't check
all the other "git grep" matches; please do so.

The changes present in this patch all look good.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

April 5, 2022, 10:41:28

On Tue, Apr 5, 2022 at 4:08 PM Noah Misch <noah@leadboat.com> wrote:
>
> On Tue, Apr 05, 2022 at 03:05:10PM +0900, Masahiko Sawada wrote:
> > I've attached an updated patch. The patch includes a regression test
> > to detect the new violation as we discussed. I've confirmed that
> > Cirrus CI tests pass. Please confirm on AIX and review the patch.
>
> When the context of a "git grep skiplsn" match involves several struct fields
> in struct order, please change to the new order.  In other words, do for all
> "git grep skiplsn" matches what the v2 patch does in GetSubscription().  The
> v2 patch does not do this for catalogs.sgml, but it ought to.  I didn't check
> all the other "git grep" matches; please do so.

Oops, I missed many places. I checked all "git grep" matches and fixed them.


Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

make_subskiplsn_aligned_v3.patch

Re: Skipping logical replication transactions on subscriber side

From

Noah Misch

Date:

April 6, 2022, 06:21:00

On Tue, Apr 05, 2022 at 04:41:28PM +0900, Masahiko Sawada wrote:
> On Tue, Apr 5, 2022 at 4:08 PM Noah Misch <noah@leadboat.com> wrote:
> > On Tue, Apr 05, 2022 at 03:05:10PM +0900, Masahiko Sawada wrote:
> > > I've attached an updated patch. The patch includes a regression test
> > > to detect the new violation as we discussed. I've confirmed that
> > > Cirrus CI tests pass. Please confirm on AIX and review the patch.
> >
> > When the context of a "git grep skiplsn" match involves several struct fields
> > in struct order, please change to the new order.  In other words, do for all
> > "git grep skiplsn" matches what the v2 patch does in GetSubscription().  The
> > v2 patch does not do this for catalogs.sgml, but it ought to.  I didn't check
> > all the other "git grep" matches; please do so.
> 
> Oops, I missed many places. I checked all "git grep" matches and fixed them.

> --- a/src/backend/catalog/system_views.sql
> +++ b/src/backend/catalog/system_views.sql
> @@ -1285,8 +1285,8 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
>  
>  -- All columns of pg_subscription except subconninfo are publicly readable.
>  REVOKE ALL ON pg_subscription FROM public;
> -GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
> -              substream, subtwophasestate, subdisableonerr, subskiplsn, subslotname,
> +GRANT SELECT (oid, subdbid, subname, subskiplsn, subowner, subenabled,
> +              subbinary, substream, subtwophasestate, subdisableonerr, subslotname,
>                subsynccommit, subpublications)

subskiplsn comes before subname.  Other than that, this looks done.  I
recommend committing it with that change.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

April 6, 2022, 06:54:42

On Wed, Apr 6, 2022 at 12:21 PM Noah Misch <noah@leadboat.com> wrote:
>
> On Tue, Apr 05, 2022 at 04:41:28PM +0900, Masahiko Sawada wrote:
> > On Tue, Apr 5, 2022 at 4:08 PM Noah Misch <noah@leadboat.com> wrote:
> > > On Tue, Apr 05, 2022 at 03:05:10PM +0900, Masahiko Sawada wrote:
> > > > I've attached an updated patch. The patch includes a regression test
> > > > to detect the new violation as we discussed. I've confirmed that
> > > > Cirrus CI tests pass. Please confirm on AIX and review the patch.
> > >
> > > When the context of a "git grep skiplsn" match involves several struct fields
> > > in struct order, please change to the new order.  In other words, do for all
> > > "git grep skiplsn" matches what the v2 patch does in GetSubscription().  The
> > > v2 patch does not do this for catalogs.sgml, but it ought to.  I didn't check
> > > all the other "git grep" matches; please do so.
> >
> > Oops, I missed many places. I checked all "git grep" matches and fixed them.
>
> > --- a/src/backend/catalog/system_views.sql
> > +++ b/src/backend/catalog/system_views.sql
> > @@ -1285,8 +1285,8 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
> >
> >  -- All columns of pg_subscription except subconninfo are publicly readable.
> >  REVOKE ALL ON pg_subscription FROM public;
> > -GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary,
> > -              substream, subtwophasestate, subdisableonerr, subskiplsn, subslotname,
> > +GRANT SELECT (oid, subdbid, subname, subskiplsn, subowner, subenabled,
> > +              subbinary, substream, subtwophasestate, subdisableonerr, subslotname,
> >                subsynccommit, subpublications)
>
> subskiplsn comes before subname.

Right. I've attached an updated patch.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Attachments

make_subskiplsn_aligned_v4.patch

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

April 6, 2022, 07:31:43

On Wed, Apr 6, 2022 at 9:25 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Apr 6, 2022 at 12:21 PM Noah Misch <noah@leadboat.com> wrote:
>
> Right. I've attached an updated patch.
>

Thanks, this looks good to me as well. Noah, would you like to commit it?

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Peter Eisentraut

Date:

April 6, 2022, 11:53:28

On 02.04.22 10:13, Noah Misch wrote:
> uint64 and pg_lsn use TYPALIGN_DOUBLE.  For AIX, they really need a typalign
> corresponding to ALIGNOF_LONG.  Hence, the C struct layout doesn't match the
> tuple layout.  Columns potentially affected:
> 
> [local] test=*# select attrelid::regclass, attname from pg_attribute a join pg_class c on c.oid = attrelid where attalign = 'd' and relkind = 'r' and attnotnull and attlen <> -1;
>
>      attrelid     │   attname
> ─────────────────┼──────────────
>   pg_sequence     │ seqstart
>   pg_sequence     │ seqincrement
>   pg_sequence     │ seqmax
>   pg_sequence     │ seqmin
>   pg_sequence     │ seqcache
>   pg_subscription │ subskiplsn
> (6 rows)
> 
> The pg_sequence fields evade trouble, because there's exactly eight bytes (two
> oids) before them.

Yes, we carefully did this when we ran into this the last time.  See 

<https://www.postgresql.org/message-id/flat/76ce2ca3-40f2-d291-eae2-17b599f29ba0%402ndquadrant.com#cf1313adff98e1d5e1ca789497898310>

and commit f3b421da5f4addc95812b9db05a24972b8fd9739.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

April 7, 2022, 05:55:45

On Wed, Apr 6, 2022 at 10:01 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Apr 6, 2022 at 9:25 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Apr 6, 2022 at 12:21 PM Noah Misch <noah@leadboat.com> wrote:
> >
> > Right. I've attached an updated patch.
> >
>
> Thanks, this looks good to me as well. Noah, would you like to commit it?
>

I'll take care of this today. I think we can mark the new function
get_column_offset() being introduced by this patch as parallel safe.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Amit Kapila

Date:

April 7, 2022, 13:27:54

On Thu, Apr 7, 2022 at 8:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I'll take care of this today. I think we can mark the new function
> get_column_offset() being introduced by this patch as parallel safe.
>

Pushed.

-- 
With Regards,
Amit Kapila.

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

April 7, 2022, 14:39:58

On Thu, Apr 7, 2022 at 7:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Apr 7, 2022 at 8:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > I'll take care of this today. I think we can mark the new function
> > get_column_offset() being introduced by this patch as parallel safe.
> >
>
> Pushed.

Thanks!

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Noah Misch

Date:

April 15, 2022, 10:26:01

On Thu, Apr 07, 2022 at 08:39:58PM +0900, Masahiko Sawada wrote:
> On Thu, Apr 7, 2022 at 7:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Thu, Apr 7, 2022 at 8:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > I'll take care of this today. I think we can mark the new function
> > > get_column_offset() being introduced by this patch as parallel safe.
> >
> > Pushed.
> 
> Thanks!

I took a closer look at the test case.  The "get_column_offset(coltypes) % 8"
part would have caught the problem only when run on an ALIGNOF_DOUBLE==4
platform.  Instead of testing the start of the typalign='d' column, let's test
the first offset beyond the previous column.  The difference between those two
values depends on ALIGNOF_DOUBLE.  While there, ignore typbyval; it doesn't
affect disk tuple layout, so this test shouldn't care.  I plan to push the
attached patch.
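
The difference between the two checks can be worked through by hand. The following is a minimal sketch in C, not the regression test itself: `ColSpec`, `align_up`, `end_of_columns`, `subskiplsn_prev_end` and `subskiplsn_start` are hypothetical names, and the lengths assume NAMEDATALEN is 64 and the pre-reordering column list {oid,oid,name,oid,bool,bool,bool,char,bool} shown earlier in the thread.

```c
#include <stddef.h>

/* Hypothetical column descriptor: byte length and required alignment. */
typedef struct
{
    size_t len;
    size_t align;
} ColSpec;

/* Round off up to the next multiple of align (align must be a power of two). */
size_t
align_up(size_t off, size_t align)
{
    return (off + align - 1) & ~(align - 1);
}

/* First offset beyond the last of ncols columns laid out in order. */
size_t
end_of_columns(const ColSpec *cols, size_t ncols)
{
    size_t off = 0;

    for (size_t i = 0; i < ncols; i++)
        off = align_up(off, cols[i].align) + cols[i].len;
    return off;
}

/*
 * Columns that preceded subskiplsn before the reordering:
 * {oid,oid,name,oid,bool,bool,bool,char,bool}, assuming NAMEDATALEN is 64.
 */
static const ColSpec before_subskiplsn[] = {
    {4, 4}, {4, 4}, {64, 1}, {4, 4}, {1, 1}, {1, 1}, {1, 1}, {1, 1}, {1, 1}
};

/* What the revised check looks at: the end of the previous column. */
size_t
subskiplsn_prev_end(void)
{
    return end_of_columns(before_subskiplsn,
                          sizeof(before_subskiplsn) / sizeof(before_subskiplsn[0]));
}

/* What the original check looked at: the aligned start of the column. */
size_t
subskiplsn_start(size_t alignof_double)
{
    return align_up(subskiplsn_prev_end(), alignof_double);
}
```

Under these assumptions the previous column ends at 81 on every platform, while the aligned start is 84 with ALIGNOF_DOUBLE==4 and 88 with ALIGNOF_DOUBLE==8. A "% 8" test on the start therefore fires only on the 4-byte platform, whereas a "% 8" test on the end of the previous column fires everywhere.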

Attachments

sanity_check-skiplsn-v1.patch

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

April 18, 2022, 04:45:50

On Fri, Apr 15, 2022 at 4:26 PM Noah Misch <noah@leadboat.com> wrote:
>
> On Thu, Apr 07, 2022 at 08:39:58PM +0900, Masahiko Sawada wrote:
> > On Thu, Apr 7, 2022 at 7:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > On Thu, Apr 7, 2022 at 8:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > I'll take care of this today. I think we can mark the new function
> > > > get_column_offset() being introduced by this patch as parallel safe.
> > >
> > > Pushed.
> >
> > Thanks!
>
> I took a closer look at the test case.  The "get_column_offset(coltypes) % 8"
> part would have caught the problem only when run on an ALIGNOF_DOUBLE==4
> platform.  Instead of testing the start of the typalign='d' column, let's test
> the first offset beyond the previous column.  The difference between those two
> values depends on ALIGNOF_DOUBLE.

Yes, but it could produce false positives in some cases. For instance, the
column order {oid, bool, XLogRecPtr} should be okay on both ALIGNOF_DOUBLE == 4
and 8 platforms, but the new test fails.
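
The arithmetic behind this example can be sketched in C as follows. This is an illustration only: `align_up`, `prev_end_oid_bool`, `tuple_start` and `revised_check_flags` are hypothetical names, and the check is modeled as I understand it from the description above (flag the column unless the previous column ends on an 8-byte boundary).

```c
#include <stddef.h>

/* Round off up to the next multiple of align (align must be a power of two). */
size_t
align_up(size_t off, size_t align)
{
    return (off + align - 1) & ~(align - 1);
}

/* {oid, bool} precede the XLogRecPtr column: 4 bytes, then 1 byte. */
size_t
prev_end_oid_bool(void)
{
    return align_up(0, 4) + 4 + 1;
}

/* Where an 8-byte typalign='d' column would start for a given ALIGNOF_DOUBLE. */
size_t
tuple_start(size_t prev_end, size_t alignof_double)
{
    return align_up(prev_end, alignof_double);
}

/* Modeled revised check: nonzero means the column is reported. */
int
revised_check_flags(size_t prev_end)
{
    return prev_end % 8 != 0;
}
```

Here the previous column ends at offset 5, and the XLogRecPtr column starts at offset 8 with either alignment, so the layouts agree, yet the modeled check still reports the column because 5 is not a multiple of 8.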

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Noah Misch

Date:

April 18, 2022, 06:22:24

On Mon, Apr 18, 2022 at 10:45:50AM +0900, Masahiko Sawada wrote:
> On Fri, Apr 15, 2022 at 4:26 PM Noah Misch <noah@leadboat.com> wrote:
> > On Thu, Apr 07, 2022 at 08:39:58PM +0900, Masahiko Sawada wrote:
> > > On Thu, Apr 7, 2022 at 7:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > On Thu, Apr 7, 2022 at 8:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > I'll take care of this today. I think we can mark the new function
> > > > > get_column_offset() being introduced by this patch as parallel safe.
> > > >
> > > > Pushed.
> > >
> > > Thanks!
> >
> > I took a closer look at the test case.  The "get_column_offset(coltypes) % 8"
> > part would have caught the problem only when run on an ALIGNOF_DOUBLE==4
> > platform.  Instead of testing the start of the typalign='d' column, let's test
> > the first offset beyond the previous column.  The difference between those two
> > values depends on ALIGNOF_DOUBLE.
> 
> Yes, but it could be false positives in some cases. For instance, the
> column {oid, bool, XLogRecPtr} should be okay on ALIGNOF_DOUBLE == 4
> and 8 platforms but the new test fails.

I'm happy with that, because the affected author should look for padding-free
layouts before settling on your example layout.  If the padding-free layouts
are all unacceptable, the author should update the expected sanity_check.out
to show the one row where the test "fails".

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

April 18, 2022, 07:32:37

On Mon, Apr 18, 2022 at 12:22 PM Noah Misch <noah@leadboat.com> wrote:
>
> On Mon, Apr 18, 2022 at 10:45:50AM +0900, Masahiko Sawada wrote:
> > On Fri, Apr 15, 2022 at 4:26 PM Noah Misch <noah@leadboat.com> wrote:
> > > On Thu, Apr 07, 2022 at 08:39:58PM +0900, Masahiko Sawada wrote:
> > > > On Thu, Apr 7, 2022 at 7:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > On Thu, Apr 7, 2022 at 8:25 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > I'll take care of this today. I think we can mark the new function
> > > > > > get_column_offset() being introduced by this patch as parallel safe.
> > > > >
> > > > > Pushed.
> > > >
> > > > Thanks!
> > >
> > > I took a closer look at the test case.  The "get_column_offset(coltypes) % 8"
> > > part would have caught the problem only when run on an ALIGNOF_DOUBLE==4
> > > platform.  Instead of testing the start of the typalign='d' column, let's test
> > > the first offset beyond the previous column.  The difference between those two
> > > values depends on ALIGNOF_DOUBLE.
> >
> > Yes, but it could be false positives in some cases. For instance, the
> > column {oid, bool, XLogRecPtr} should be okay on ALIGNOF_DOUBLE == 4
> > and 8 platforms but the new test fails.
>
> I'm happy with that, because the affected author should look for padding-free
> layouts before settling on your example layout.  If the padding-free layouts
> are all unacceptable, the author should update the expected sanity_check.out
> to show the one row where the test "fails".

That makes sense.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Robert Haas

Date:

June 13, 2022, 17:25:24

On Sun, Apr 17, 2022 at 11:22 PM Noah Misch <noah@leadboat.com> wrote:
> > Yes, but it could be false positives in some cases. For instance, the
> > column {oid, bool, XLogRecPtr} should be okay on ALIGNOF_DOUBLE == 4
> > and 8 platforms but the new test fails.
>
> I'm happy with that, because the affected author should look for padding-free
> layouts before settling on your example layout.  If the padding-free layouts
> are all unacceptable, the author should update the expected sanity_check.out
> to show the one row where the test "fails".

I realize that it was necessary to get something committed quickly
here to unbreak the buildfarm, but this is really a mess. As I
understand it, the problem here is that typalign='d' is either 4 bytes
or 8 depending on how the 'double' type is aligned on that platform,
but we use that typalign value also for some other data types that may
not be aligned in the same way as 'double'. Consequently, it's
possible to have a situation where the behavior of the C compiler
diverges from the behavior of heap_form_tuple(). To avoid that, we
need every catalog column that uses typalign=='d' to begin on an
8-byte boundary. We also want all such columns to occur before the
first NameData column in the catalog, to guard against the possibility
that NAMEDATALEN has been redefined to an odd value. I think this set
of constraints is a nuisance and that it's mostly good luck we haven't
run into any really awkward problems here so far.
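
The divergence described here can be made concrete with a small C sketch. `DemoRow`, `align_up`, `struct_lsn_offset` and `tuple_lsn_offset` are hypothetical names, and `tuple_lsn_offset` only models a tuple-forming routine that aligns the 8-byte column to a given typalign='d' width; it is not heap_form_tuple() itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-column catalog row: a 4-byte OID, then an 8-byte LSN. */
typedef struct
{
    uint32_t oid;
    uint64_t lsn;
} DemoRow;

/* Round off up to the next multiple of align (align must be a power of two). */
size_t
align_up(size_t off, size_t align)
{
    return (off + align - 1) & ~(align - 1);
}

/* Offset the C compiler chose for the lsn member. */
size_t
struct_lsn_offset(void)
{
    return offsetof(DemoRow, lsn);
}

/* Offset a tuple-forming routine would choose when aligning to typalign_d. */
size_t
tuple_lsn_offset(size_t typalign_d)
{
    return align_up(sizeof(uint32_t), typalign_d);
}
```

On a platform where the compiler aligns uint64_t to 8 bytes but double to 4, `struct_lsn_offset()` would be 8 while `tuple_lsn_offset(4)` is 4, which is the mismatch between the struct and the on-disk tuple.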

In many of our catalogs, the first member is an OID and the second
member of the struct is of type NameData: pg_namespace, pg_class,
pg_proc, etc. That common design pattern is in direct contradiction to
the desires of this test case. As soon as someone wants to add a
typalign='d' member to any of those system catalogs, the struct layout
is going to have to get shuffled around -- and then it will look
different from all the other ones. Or else we'd have to rearrange them
all to move all the NameData columns to the end. I feel like it's
weird to introduce a test case that so obviously flies in the face of
how catalog layout has been done up to this point, especially for the
sake of a hypothetical user who wants to set NAMEDATALEN to an odd
number. I doubt such scenarios have been thoroughly tested, or ever
will be. Perhaps instead we ought to legislate that NAMEDATALEN must
be a multiple of 8, or some such thing.
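
To show why the length of NameData matters for what follows it, here is a minimal C sketch. The function names are hypothetical, the layout (some 4-byte OIDs, one name column, then an 8-byte column) is assumed for illustration, and 60 is just an arbitrary value that is not a multiple of 8.

```c
#include <stddef.h>

/* Round off up to the next multiple of align (align must be a power of two). */
size_t
align_up(size_t off, size_t align)
{
    return (off + align - 1) & ~(align - 1);
}

/* First offset after noids 4-byte OIDs followed by one name column. */
size_t
end_after_oids_and_name(size_t noids, size_t namedatalen)
{
    return 4 * noids + namedatalen;
}

/*
 * Number of bytes by which 8-byte and 4-byte alignment disagree about
 * where the next 8-byte column starts; 0 means both layouts agree.
 */
size_t
placement_gap(size_t prev_end)
{
    return align_up(prev_end, 8) - align_up(prev_end, 4);
}
```

With two OIDs and NAMEDATALEN 64 the name ends at 72 and the gap is 0; with NAMEDATALEN 60 it ends at 68 and the gap is 4. The common {oid, name} prefix with NAMEDATALEN 64 also ends at 68, so an 8-byte column placed right after it would be laid out differently on the two kinds of platform.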

The other constraint, that typalign='d' fields must always fall on an
8 byte boundary, is probably less annoying in practice, but it's easy
to imagine a future catalog running into trouble. Let's say we want to
introduce a new catalog that has only an Oid column and a float8
column. Perhaps with 0-3 bool or uint8 columns as well, or with any
number of NameData columns as well. Well, the only way to satisfy this
constraint is to put the float8 column first and the Oid column after
it, which immediately makes it look different from every other catalog
we have. It's hard to feel like that would be a good solution here. I
think we ought to try to engineer a solution where heap_form_tuple()
is going to do the same thing as the C compiler without the sorts of
extra rules that this test case enforces.

AFAICS, we could do that by:

1. De-supporting platforms that have this problem, or
2. Introducing new typalign values, as Noah proposed back on April 2, or
3. Somehow forcing values that are sometimes 4-byte aligned and
sometimes 8-byte aligned to be 8-byte alignment on all platforms

I also don't like the fact that the test case doesn't even catch
exactly the problematic set of cases, but rather a superset, leaving
it up to future patch authors to make a correct judgment about whether
a certain new column can be listed as an expected output of the test
case or whether the catalog representation must be changed. The idea
that we'll reliably get that right might be optimistic. Again, I don't
mean to say that this is the fault of this test case since, without
the test case, we'd have no idea that there was even a potential
problem, which would not be better. But it feels to me like we're
hacking around the real problem instead of fixing it, and it seems to
me that we should try to do better.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

June 14, 2022, 10:53:42

On Mon, Jun 13, 2022 at 11:25 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Sun, Apr 17, 2022 at 11:22 PM Noah Misch <noah@leadboat.com> wrote:
> > > Yes, but it could be false positives in some cases. For instance, the
> > > column {oid, bool, XLogRecPtr} should be okay on ALIGNOF_DOUBLE == 4
> > > and 8 platforms but the new test fails.
> >
> > I'm happy with that, because the affected author should look for padding-free
> > layouts before settling on your example layout.  If the padding-free layouts
> > are all unacceptable, the author should update the expected sanity_check.out
> > to show the one row where the test "fails".
>
> I realize that it was necessary to get something committed quickly
> here to unbreak the buildfarm, but this is really a mess. As I
> understand it, the problem here is that typalign='d' is either 4 bytes
> or 8 depending on how the 'double' type is aligned on that platform,
> but we use that typalign value also for some other data types that may
> not be aligned in the same way as 'double'. Consequently, it's
> possible to have a situation where the behavior of the C compiler
> diverges from the behavior of heap_form_tuple(). To avoid that, we
> need every catalog column that uses typalign=='d' to begin on an
> 8-byte boundary. We also want all such columns to occur before the
> first NameData column in the catalog, to guard against the possibility
> that NAMEDATALEN has been redefined to an odd value. I think this set
> of constraints is a nuisance and that it's mostly good luck we haven't
> run into any really awkward problems here so far.
>
> In many of our catalogs, the first member is an OID and the second
> member of the struct is of type NameData: pg_namespace, pg_class,
> pg_proc, etc. That common design pattern is in direct contradiction to
> the desires of this test case. As soon as someone wants to add a
> typalign='d' member to any of those system catalogs, the struct layout
> is going to have to get shuffled around -- and then it will look
> different from all the other ones. Or else we'd have to rearrange them
> all to move all the NameData columns to the end. I feel like it's
> weird to introduce a test case that so obviously flies in the face of
> how catalog layout has been done up to this point, especially for the
> sake of a hypothetical user who want to set NAMEDATALEN to an odd
> number. I doubt such scenarios have been thoroughly tested, or ever
> will be. Perhaps instead we ought to legislate that NAMEDATALEN must
> be a multiple of 8, or some such thing.
>
> The other constraint, that typalign='d' fields must always fall on an
> 8 byte boundary, is probably less annoying in practice, but it's easy
> to imagine a future catalog running into trouble. Let's say we want to
> introduce a new catalog that has only an Oid column and a float8
> column. Perhaps with 0-3 bool or uint8 columns as well, or with any
> number of NameData columns as well. Well, the only way to satisfy this
> constraint is to put the float8 column first and the Oid column after
> it, which immediately makes it look different from every other catalog
> we have. It's hard to feel like that would be a good solution here. I
> think we ought to try to engineer a solution where heap_form_tuple()
> is going to do the same thing as the C compiler without the sorts of
> extra rules that this test case enforces.

These seem to be valid concerns.

> AFAICS, we could do that by:
>
> 1. De-supporting platforms that have this problem, or
> 2. Introducing new typalign values, as Noah proposed back on April 2, or
> 3. Somehow forcing values that are sometimes 4-byte aligned and
> sometimes 8-byte aligned to be 8-byte alignment on all platforms

Introducing new typalign values seems like a good idea to me, as it's more
future-proof. This item would be for PG16, right? The main concern
seems to be that what this test case enforces would be a nuisance when
introducing a new system catalog or adding a new column to an existing
catalog, but given that we're past PG15 beta1, that is unlikely to happen in
PG15.


Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Robert Haas

Date:

June 15, 2022, 20:27:06

On Tue, Jun 14, 2022 at 3:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > AFAICS, we could do that by:
> >
> > 1. De-supporting platforms that have this problem, or
> > 2. Introducing new typalign values, as Noah proposed back on April 2, or
> > 3. Somehow forcing values that are sometimes 4-byte aligned and
> > sometimes 8-byte aligned to be 8-byte alignment on all platforms
>
> Introducing new typalign values seems a good idea to me as it's more
> future-proof. Will this item be for PG16, right? The main concern
> seems that what this test case enforces would be nuisance when
> introducing a new system catalog or a new column to the existing
> catalog but given we're in post PG15-beta1 it is unlikely to happen in
> PG15.

I agree that we're not likely to introduce a new typalign value any
sooner than v16. There are a couple of things that bother me about
that solution. One is that I don't know how many different behaviors
exist out there in the wild. If we distinguish the alignment of double
from the alignment of int8, is that good enough, or are there other
data types whose properties aren't necessarily the same as either of
those? The other is that 32-bit systems are already relatively rare
and probably will become more rare until they disappear completely. It
doesn't seem like a ton of fun to engineer solutions to problems that
may go away by themselves with the passage of time. On the other hand,
if the alternative is to live with this kind of ugliness for another 5
years, maybe the time it takes to craft a solution is effort well
spent.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Skipping logical replication transactions on subscriber side

From

Masahiko Sawada

Date:

June 16, 2022, 10:25:36

On Thu, Jun 16, 2022 at 2:27 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Jun 14, 2022 at 3:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > AFAICS, we could do that by:
> > >
> > > 1. De-supporting platforms that have this problem, or
> > > 2. Introducing new typalign values, as Noah proposed back on April 2, or
> > > 3. Somehow forcing values that are sometimes 4-byte aligned and
> > > sometimes 8-byte aligned to be 8-byte alignment on all platforms
> >
> > Introducing new typalign values seems a good idea to me as it's more
> > future-proof. Will this item be for PG16, right? The main concern
> > seems that what this test case enforces would be nuisance when
> > introducing a new system catalog or a new column to the existing
> > catalog but given we're in post PG15-beta1 it is unlikely to happen in
> > PG15.
>
> I agree that we're not likely to introduce a new typalign value any
> sooner than v16. There are a couple of things that bother me about
> that solution. One is that I don't know how many different behaviors
> exist out there in the wild. If we distinguish the alignment of double
> from the alignment of int8, is that good enough, or are there other
> data types whose properties aren't necessarily the same as either of
> those?

Yeah, there might be.

> The other is that 32-bit systems are already relatively rare
> and probably will become more rare until they disappear completely. It
> doesn't seem like a ton of fun to engineer solutions to problems that
> may go away by themselves with the passage of time.

IIUC the systems affected by this problem are not necessarily 32-bit
systems. For instance, hoverfly on the buildfarm is a 64-bit system but
was affected by this problem. According to the XLC manual[1], there is
no difference between 32-bit systems and 64-bit systems in terms of
alignment for double. FWIW, looking at the manual, one solution for AIX
might have been to specify the -qalign=natural compiler option in
order to force the alignment of double to 8.

Regards,

[1] https://support.scinet.utoronto.ca/Manuals/xlC++-proguide.pdf;
Table 11 on page 10.

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/

Re: Skipping logical replication transactions on subscriber side

From

Robert Haas

Date:

June 16, 2022, 19:35:43

On Thu, Jun 16, 2022 at 3:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> FWIW, looking at the manual, there might have
> been a solution for AIX to specify -qalign=natural compiler option in
> order to enforce the alignment of double to 8.

Well if that can work it sure seems better.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Skipping logical replication transactions on subscriber side

From

Peter Eisentraut

Date:

June 20, 2022, 16:52:49

On 16.06.22 18:35, Robert Haas wrote:
> On Thu, Jun 16, 2022 at 3:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>> FWIW, looking at the manual, there might have
>> been a solution for AIX to specify -qalign=natural compiler option in
>> order to enforce the alignment of double to 8.
> 
> Well if that can work it sure seems better.

That means changing the system's ABI, so in the extreme case you then 
need to compile everything else to match as well.

Re: Skipping logical replication transactions on subscriber side

From

Robert Haas

Date:

June 20, 2022, 17:04:06

On Mon, Jun 20, 2022 at 9:52 AM Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
> That means changing the system's ABI, so in the extreme case you then
> need to compile everything else to match as well.

I think we wouldn't want to do that in a minor release, but doing it
in a new major release seems fine -- especially if only AIX is
affected.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Skipping logical replication transactions on subscriber side

From

Noah Misch

Date:

June 22, 2022, 07:28:14

On Mon, Jun 13, 2022 at 10:25:24AM -0400, Robert Haas wrote:
> On Sun, Apr 17, 2022 at 11:22 PM Noah Misch <noah@leadboat.com> wrote:
> > > Yes, but it could be false positives in some cases. For instance, the
> > > column {oid, bool, XLogRecPtr} should be okay on ALIGNOF_DOUBLE == 4
> > > and 8 platforms but the new test fails.
> >
> > I'm happy with that, because the affected author should look for padding-free
> > layouts before settling on your example layout.  If the padding-free layouts
> > are all unacceptable, the author should update the expected sanity_check.out
> > to show the one row where the test "fails".

> Perhaps instead we ought to legislate that NAMEDATALEN must
> be a multiple of 8, or some such thing.
> 
> The other constraint, that typalign='d' fields must always fall on an
> 8 byte boundary, is probably less annoying in practice, but it's easy
> to imagine a future catalog running into trouble. Let's say we want to
> introduce a new catalog that has only an Oid column and a float8
> column. Perhaps with 0-3 bool or uint8 columns as well, or with any
> number of NameData columns as well. Well, the only way to satisfy this
> constraint is to put the float8 column first and the Oid column after
> it, which immediately makes it look different from every other catalog
> we have.

> AFAICS, we could do that by:
> 
> 1. De-supporting platforms that have this problem, or
> 2. Introducing new typalign values, as Noah proposed back on April 2, or
> 3. Somehow forcing values that are sometimes 4-byte aligned and
> sometimes 8-byte aligned to be 8-byte alignment on all platforms

On Mon, Jun 20, 2022 at 10:04:06AM -0400, Robert Haas wrote:
> On Mon, Jun 20, 2022 at 9:52 AM Peter Eisentraut
> <peter.eisentraut@enterprisedb.com> wrote:
> > That means changing the system's ABI, so in the extreme case you then
> > need to compile everything else to match as well.
> 
> I think we wouldn't want to do that in a minor release, but doing it
> in a new major release seems fine -- especially if only AIX is
> affected.

"Everything" isn't limited to PostgreSQL.  The Perl ABI exposes large structs
to plperl; a field of type double could require the AIX user to rebuild Perl
with the same compiler option.

Overall, this could be a textbook example of choosing between:

- Mild harm (unaesthetic column order) to many people.
- Considerable harm (dump/reload instead of pg_upgrade) to a small, unknown,
  possibly-zero quantity of people.

Here's how I rank the options, from most-preferred to least-preferred:

1. Put new eight-byte fields at the front of each catalog, when in doubt.
2. On systems where double alignment differs from int64 alignment, require
   NAMEDATALEN%8==0.  Upgrading to v16 would require dump/reload for AIX users
   changing NAMEDATALEN to conform to the new restriction.
3. Introduce new typalign values.  Upgrading to v16 would require dump/reload
   for all AIX users.
4. De-support AIX.
5. From above, "Somehow forcing values that are sometimes 4-byte aligned and
   sometimes 8-byte aligned to be 8-byte alignment on all platforms".
   Upgrading to v16 would require dump/reload for all AIX users.
6. Require -qalign=natural on AIX.  Upgrading to v16 would require dump/reload
   and possible system library rebuilds for all AIX users.

I gather (1) isn't at the top of your ranking, or you wouldn't have written
in.  What do you think of (2)?

Re: Skipping logical replication transactions on subscriber side

From

Robert Haas

Date:

June 22, 2022, 16:50:02

On Wed, Jun 22, 2022 at 12:28 AM Noah Misch <noah@leadboat.com> wrote:
> "Everything" isn't limited to PostgreSQL.  The Perl ABI exposes large structs
> to plperl; a field of type double could require the AIX user to rebuild Perl
> with the same compiler option.

Oh, that isn't so great, then.

> Here's how I rank the options, from most-preferred to least-preferred:
>
> 1. Put new eight-byte fields at the front of each catalog, when in doubt.
> 2. On systems where double alignment differs from int64 alignment, require
>    NAMEDATALEN%8==0.  Upgrading to v16 would require dump/reload for AIX users
>    changing NAMEDATALEN to conform to the new restriction.
> 3. Introduce new typalign values.  Upgrading to v16 would require dump/reload
>    for all AIX users.
> 4. De-support AIX.
> 5. From above, "Somehow forcing values that are sometimes 4-byte aligned and
>    sometimes 8-byte aligned to be 8-byte alignment on all platforms".
>    Upgrading to v16 would require dump/reload for all AIX users.
> 6. Require -qalign=natural on AIX.  Upgrading to v16 would require dump/reload
>    and possible system library rebuilds for all AIX users.
>
> I gather (1) isn't at the top of your ranking, or you wouldn't have written
> in.  What do you think of (2)?

(2) pleases me in the sense that it seems to inconvenience very few
people, perhaps no one, in order to avoid inconveniencing a larger
number of people. However, it doesn't seem sufficient. If I understand
correctly, even a catalog that includes no NameData column can have a
problem.

Regarding (1), it is my opinion that the only real value of typalign
is for system catalogs, and specifically that it lets you put the
fields in an order that is aesthetically pleasing rather than worrying
about alignment considerations. After all, if we just ordered the
fields by descending alignment requirement, we could get rid of
typalign altogether (at least, if we didn't care about backward
compatibility). User tables would get smaller because we'd get rid of
alignment padding, and I don't think we'd see much impact on
performance because, for user tables, we copy the values into a datum
array before doing anything interesting with them. So (1) seems to me
to be conceding that typalign is unfit for the only purpose it has.
Perhaps that's just how things are, but it doesn't seem like a good
way for things to be.
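
To make the padding argument concrete, here is a minimal sketch. It assumes
a hypothetical table with four fixed-width columns (the names, sizes and
alignments are invented for illustration), ignores variable-length columns
and the tuple header, and compares the row width in declared order with the
width after sorting by descending alignment requirement.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def row_width(columns):
    """Bytes used by fixed-width columns laid out in the given order.

    columns is a list of (name, size, alignment) tuples.
    """
    offset = 0
    for _name, size, alignment in columns:
        offset = align_up(offset, alignment)  # insert alignment padding
        offset += size
    return offset


# Hypothetical columns: a flag, a 64-bit id, a 16-bit code, a 32-bit count.
declared = [("flag", 1, 1), ("id", 8, 8), ("code", 2, 2), ("count", 4, 4)]
by_alignment = sorted(declared, key=lambda col: col[2], reverse=True)

declared_width = row_width(declared)
sorted_width = row_width(by_alignment)
print(declared_width, sorted_width)  # prints "24 15"
```

In this toy case the declared order spends 9 bytes on padding, and the
sorted order spends none, because each column starts where the previous
one ended.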

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: Skipping logical replication transactions on subscriber side

From

Tom Lane

Date:

June 22, 2022, 17:39:20

[ sorry for not having tracked this thread more closely ... ]

Robert Haas <robertmhaas@gmail.com> writes:
> Regarding (1), it is my opinion that the only real value of typalign
> is for system catalogs, and specifically that it lets you put the
> fields in an order that is aesthetically pleasing rather than worrying
> about alignment considerations. After all, if we just ordered the
> fields by descending alignment requirement, we could get rid of
> typalign altogether (at least, if we didn't care about backward
> compatibility). User tables would get smaller because we'd get rid of
> alignment padding, and I don't think we'd see much impact on
> performance because, for user tables, we copy the values into a datum
> array before doing anything interesting with them. So (1) seems to me
> to be conceding that typalign is unfit for the only purpose it has.

That's a fundamental misreading of the situation.  typalign is essential
on alignment-picky architectures, else you will get a SIGBUS fault
when trying to fetch a multibyte value (whether it's just going to get
stored into a Datum array is not very relevant here).

It appears that what we've got on AIX is that typalign 'd' overstates the
actual alignment requirement for 'double', which is safe from the SIGBUS
angle.  However, it is a problem for our usage with system catalogs,
where our C struct declarations may not line up with the way that a
tuple is constructed by the tuple assembly routines.

I concur that Noah's description of #2 is not an accurate statement
of the rules we'd have to impose to be sure that the C structs line up
with the actual tuple layouts.  I don't think we want rules exactly,
what we need is mechanical verification that the field orderings in
use are safe.  The last time I looked at this thread, what was being
discussed was (a) re-ordering pg_subscription's columns and (b)
adding some kind of regression test to verify that all catalogs meet
the expectation of 'd'-aligned fields not needing alignment padding
that an AIX compiler might choose not to insert.  That still seems
like the most plausible answer to me.  I don't especially want to
invent an additional typalign code that we could only test on legacy
platforms.
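
A rough sketch of what such a mechanical check might look like is below.
It is an illustration only, not the project's regression test. It assumes a
simplified model: every 'd' column is treated as a 64-bit integer, the tuple
assembly code aligns 'd' on 4 bytes (as when ALIGNOF_DOUBLE is 4), and the
compiler aligns the matching struct field on 8 bytes. The catalogs and
column names are hypothetical.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def offsets(columns, d_alignment):
    """Map column name -> byte offset for [(name, size, typalign)]."""
    alignment_of = {"c": 1, "s": 2, "i": 4, "d": d_alignment}
    result = {}
    offset = 0
    for name, size, typalign in columns:
        offset = align_up(offset, alignment_of[typalign])
        result[name] = offset
        offset += size
    return result


def misplaced_columns(columns):
    """Columns whose offset changes when 'd' means 8 bytes instead of 4.

    Once one column is misplaced, later columns may be reported as well.
    """
    in_tuple = offsets(columns, d_alignment=4)   # tuple assembly rules
    in_struct = offsets(columns, d_alignment=8)  # 8-byte-aligned int64 field
    return [name for name, _size, _align in columns
            if in_tuple[name] != in_struct[name]]


# Hypothetical catalogs: three 4-byte columns put the 8-byte column at 12.
unsafe = [("oid", 4, "i"), ("dbid", 4, "i"), ("owner", 4, "i"),
          ("lsn", 8, "d")]
# Two 4-byte columns put it at 8, which is safe under both rules.
safe = [("oid", 4, "i"), ("owner", 4, "i"), ("lsn", 8, "d"),
        ("flag", 1, "c")]

print(misplaced_columns(unsafe))  # prints "['lsn']"
print(misplaced_columns(safe))    # prints "[]"
```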

            regards, tom lane

Re: Skipping logical replication transactions on subscriber side

From

Robert Haas

Date:

June 22, 2022, 17:53:07

On Wed, Jun 22, 2022 at 10:39 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> That's a fundamental misreading of the situation.  typalign is essential
> on alignment-picky architectures, else you will get a SIGBUS fault
> when trying to fetch a multibyte value (whether it's just going to get
> stored into a Datum array is not very relevant here).

I mean, that problem is easily worked around. Maybe you think memcpy
would be a lot slower than a direct assignment, but "essential" is a
strong word.

> I concur that Noah's description of #2 is not an accurate statement
> of the rules we'd have to impose to be sure that the C structs line up
> with the actual tuple layouts.  I don't think we want rules exactly,
> what we need is mechanical verification that the field orderings in
> use are safe.  The last time I looked at this thread, what was being
> discussed was (a) re-ordering pg_subscription's columns and (b)
> adding some kind of regression test to verify that all catalogs meet
> the expectation of 'd'-aligned fields not needing alignment padding
> that an AIX compiler might choose not to insert.  That still seems
> like the most plausible answer to me.  I don't especially want to
> invent an additional typalign code that we could only test on legacy
> platforms.

I agree with that, but I don't think that having the developers
enforce alignment rules by reordering catalog columns for the sake of
legacy platforms is appealing either.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Skipping logical replication transactions on subscriber side

From

Tom Lane

Date:

June 22, 2022, 18:01:02

Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Jun 22, 2022 at 10:39 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I don't especially want to
>> invent an additional typalign code that we could only test on legacy
>> platforms.

> I agree with that, but I don't think that having the developers
> enforce alignment rules by reordering catalog columns for the sake of
> legacy platforms is appealing either.

Given that we haven't run into this before, it seems like a reasonable
bet that the problem will seldom arise.  So as long as we have a
cross-check I'm all right with calling it good and moving on.  Expending
a whole lot of work to improve the situation seems uncalled-for.

When and if we get to a point where we're ready to break on-disk
compatibility for user tables, perhaps revisiting the alignment
rules would be an appropriate component of that.  I don't see that
happening in the foreseeable future, though.

            regards, tom lane

Re: Skipping logical replication transactions on subscriber side

From

Robert Haas

Date:

June 22, 2022, 18:02:50

On Wed, Jun 22, 2022 at 11:01 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Given that we haven't run into this before, it seems like a reasonable
> bet that the problem will seldom arise.  So as long as we have a
> cross-check I'm all right with calling it good and moving on.  Expending
> a whole lot of work to improve the situation seems uncalled-for.

All right. Well, I'm on record as not liking that solution, but
obviously you can and do feel differently.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Skipping logical replication transactions on subscriber side

From

Noah Misch

Date:

June 23, 2022, 05:48:24

On Wed, Jun 22, 2022 at 09:50:02AM -0400, Robert Haas wrote:
> On Wed, Jun 22, 2022 at 12:28 AM Noah Misch <noah@leadboat.com> wrote:
> > Here's how I rank the options, from most-preferred to least-preferred:
> >
> > 1. Put new eight-byte fields at the front of each catalog, when in doubt.
> > 2. On systems where double alignment differs from int64 alignment, require
> >    NAMEDATALEN%8==0.  Upgrading to v16 would require dump/reload for AIX users
> >    changing NAMEDATALEN to conform to the new restriction.
> > 3. Introduce new typalign values.  Upgrading to v16 would require dump/reload
> >    for all AIX users.
> > 4. De-support AIX.
> > 5. From above, "Somehow forcing values that are sometimes 4-byte aligned and
> >    sometimes 8-byte aligned to be 8-byte alignment on all platforms".
> >    Upgrading to v16 would require dump/reload for all AIX users.
> > 6. Require -qalign=natural on AIX.  Upgrading to v16 would require dump/reload
> >    and possible system library rebuilds for all AIX users.
> >
> > I gather (1) isn't at the top of your ranking, or you wouldn't have written
> > in.  What do you think of (2)?
> 
> (2) pleases me in the sense that it seems to inconvenience very few
> people, perhaps no one, in order to avoid inconveniencing a larger
> number of people. However, it doesn't seem sufficient.

Here's a more-verbose description of (2), with additions about what it does
and doesn't achieve:

2. On systems where double alignment differs from int64 alignment, require
   NAMEDATALEN%8==0.  Modify the test from commits 79b716c and c1da0ac to stop
   treating "name" fields specially.  The test will still fail for AIX
   compatibility violations, but "name" columns no longer limit your field
   position candidates like they do today (today == option (1)).  Upgrading to
   v16 would require dump/reload for AIX users changing NAMEDATALEN to conform
   to the new restriction.  (I'm not sure pg_upgrade checks NAMEDATALEN
   compatibility, but it should require at least one of: same NAMEDATALEN, or
   absence of "name" columns in user tables.)
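
As a small worked example of why NAMEDATALEN%8==0 matters here, consider a
hypothetical catalog with two 4-byte columns, one "name" column, and then a
64-bit column. The sketch assumes "name" has char alignment (so it adds no
padding), that the tuple code aligns the 64-bit column on 4 bytes, and that
the compiler aligns the struct field on 8 bytes.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def int64_offsets(namedatalen):
    """Return (tuple_offset, struct_offset) of the 64-bit column.

    Layout assumed: two 4-byte columns, one name column of namedatalen
    bytes, then the 64-bit column.
    """
    after_name = 4 + 4 + namedatalen         # name needs no padding
    tuple_offset = align_up(after_name, 4)   # 'd' when ALIGNOF_DOUBLE is 4
    struct_offset = align_up(after_name, 8)  # compiler-aligned int64 field
    return tuple_offset, struct_offset


for length in (64, 60):
    tuple_offset, struct_offset = int64_offsets(length)
    print(length, tuple_offset, struct_offset, tuple_offset == struct_offset)
# prints "64 72 72 True" and then "60 68 72 False"
```

With a length that is a multiple of 8, the "name" column leaves the
following offset's alignment unchanged, so the check no longer has to treat
it as an unknown.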

> If I understand
> correctly, even a catalog that includes no NameData column can have a
> problem.

Correct.

On Wed, Jun 22, 2022 at 10:39:20AM -0400, Tom Lane wrote:
> It appears that what we've got on AIX is that typalign 'd' overstates the
> actual alignment requirement for 'double', which is safe from the SIGBUS
> angle.

On AIX, typalign='d' states the exact alignment requirement for 'double'.  It
understates the alignment requirement for int64_t.

> I don't think we want rules exactly, what we need is mechanical verification
> that the field orderings in use are safe.

Commits 79b716c and c1da0ac did that.

Re: Skipping logical replication transactions on subscriber side

From

Robert Haas

Date:

June 23, 2022, 16:58:07

On Wed, Jun 22, 2022 at 10:48 PM Noah Misch <noah@leadboat.com> wrote:
> Here's a more-verbose description of (2), with additions about what it does
> and doesn't achieve:
>
> 2. On systems where double alignment differs from int64 alignment, require
>    NAMEDATALEN%8==0.  Modify the test from commits 79b716c and c1da0ac to stop
>    treating "name" fields specially.  The test will still fail for AIX
>    compatibility violations, but "name" columns no longer limit your field
>    position candidates like they do today (today == option (1)).  Upgrading to
>    v16 would require dump/reload for AIX users changing NAMEDATALEN to conform
>    to the new restriction.  (I'm not sure pg_upgrade checks NAMEDATALEN
>    compatibility, but it should require at least one of: same NAMEDATALEN, or
>    absence of "name" columns in user tables.)

Doing this much seems pretty close to free to me. I doubt anyone
really cares about using a NAMEDATALEN value that is not a multiple of
8 on any platform. I also think there are few people who care about
AIX. The intersection must be very small indeed, or so I would think.

-- 
Robert Haas
EDB: http://www.enterprisedb.com
