Обсуждение: Logical Replication of sequences

Поиск
Список
Период
Сортировка

Logical Replication of sequences

От
Amit Kapila
Дата:
In the past, we have discussed various approaches to replicate
sequences by decoding the sequence changes from WAL. However, we faced
several challenges to achieve the same, some of which are due to the
non-transactional nature of sequences. The major ones were: (a)
correctness of the decoding part, some of the problems were discussed
at [1][2][3] (b) handling of sequences especially adding certain
sequences automatically (e.g. sequences backing SERIAL/BIGSERIAL
columns) for built-in logical replication is not considered in the
proposed work [1] (c) there were some performance concerns in not so
frequent scenarios [4] (see performance issues), we can probably deal
with this by making sequences optional for builtin logical replication

It could be possible that we can deal with these and any other issues
with more work but as the use case for this feature is primarily major
version upgrades it is not clear that we want to make such a big
change to the code or are there better alternatives to achieve the
same.

This time at pgconf.dev (https://2024.pgconf.dev/), we discussed
alternative approaches for this work which I would like to summarize.
The various methods we discussed are as follows:

1. Provide a tool to copy all the sequences from publisher to
subscriber. The major drawback is that users need to perform this as
an additional step during the upgrade which would be inconvenient and
probably not as useful as some built-in mechanism.
2. Provide a command say Alter Subscription ...  Replicate Sequences
(or something like that) which users can perform before shutdown of
the publisher node during upgrade. This will allow copying all the
sequences from the publisher node to the subscriber node directly.
Similar to previous approach, this could also be inconvenient for
users.
3. Replicate published sequences via walsender at the time of shutdown
or incrementally while decoding checkpoint record. The two ways to
achieve this are: (a) WAL log a special NOOP record just before
shutting down checkpointer. Then allow the WALsender to read the
sequence data and send it to the subscriber while decoding the new
NOOP record. (b) Similar to the previous idea but instead of WAL
logging a new record directly invokes a decoding callback after
walsender receives a request to shutdown which will allow pgoutput to
read and send required sequences. This approach has a drawback that we
are adding more work at the time of shutdown but note that we already
waits for all the WAL records to be decoded and sent before shutting
down the walsender during shutdown of the node.

Any other ideas?

I have added the members I remember that were part of the discussion
in the email. Please feel free to correct me if I have misunderstood
or missed any point we talked about.

Thoughts?

[1] - https://www.postgresql.org/message-id/e4145f77-6f37-40e0-a770-aba359c50b93%40enterprisedb.com
[2] - https://www.postgresql.org/message-id/CAA4eK1Lxt%2B5a9fA-B7FRzfd1vns%3DEwZTF5z9_xO9Ms4wsqD88Q%40mail.gmail.com
[3] - https://www.postgresql.org/message-id/CAA4eK1KR4%3DyALKP0pOdVkqUwoUqD_v7oU3HzY-w0R_EBvgHL2w%40mail.gmail.com
[4] - https://www.postgresql.org/message-id/12822961-b7de-9d59-dd27-2e3dc3980c7e%40enterprisedb.com

-- 
With Regards,
Amit Kapila.



Re: Logical Replication of sequences

От
Bharath Rupireddy
Дата:
Hi,

On Tue, Jun 4, 2024 at 4:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> 3. Replicate published sequences via walsender at the time of shutdown
> or incrementally while decoding checkpoint record. The two ways to
> achieve this are: (a) WAL log a special NOOP record just before
> shutting down checkpointer. Then allow the WALsender to read the
> sequence data and send it to the subscriber while decoding the new
> NOOP record. (b) Similar to the previous idea but instead of WAL
> logging a new record directly invokes a decoding callback after
> walsender receives a request to shutdown which will allow pgoutput to
> read and send required sequences. This approach has a drawback that we
> are adding more work at the time of shutdown but note that we already
> waits for all the WAL records to be decoded and sent before shutting
> down the walsender during shutdown of the node.

Thanks. IIUC, both of the above approaches decode the sequences during
only shutdown. I'm wondering, why not periodically decode and
replicate the published sequences so that the decoding at the shutdown
will not take that longer? I can imagine a case where there are tens
of thousands of sequences in a production server, and surely decoding
and sending them just during the shutdown can take a lot of time
hampering the overall server uptime.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Logical Replication of sequences

От
Amit Kapila
Дата:
On Tue, Jun 4, 2024 at 4:53 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Tue, Jun 4, 2024 at 4:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > 3. Replicate published sequences via walsender at the time of shutdown
> > or incrementally while decoding checkpoint record. The two ways to
> > achieve this are: (a) WAL log a special NOOP record just before
> > shutting down checkpointer. Then allow the WALsender to read the
> > sequence data and send it to the subscriber while decoding the new
> > NOOP record. (b) Similar to the previous idea but instead of WAL
> > logging a new record directly invokes a decoding callback after
> > walsender receives a request to shutdown which will allow pgoutput to
> > read and send required sequences. This approach has a drawback that we
> > are adding more work at the time of shutdown but note that we already
> > waits for all the WAL records to be decoded and sent before shutting
> > down the walsender during shutdown of the node.
>
> Thanks. IIUC, both of the above approaches decode the sequences during
> only shutdown. I'm wondering, why not periodically decode and
> replicate the published sequences so that the decoding at the shutdown
> will not take that longer?
>

Even if we decode it periodically (say each time we decode the
checkpoint record) then also we need to send the entire set of
sequences at shutdown. This is because the sequences may have changed
from the last time we sent them.

>
> I can imagine a case where there are tens
> of thousands of sequences in a production server, and surely decoding
> and sending them just during the shutdown can take a lot of time
> hampering the overall server uptime.
>

It is possible but we will send only the sequences that belong to
publications for which walsender is supposed to send the required
data. Now, we can also imagine providing option 2 (Alter Subscription
... Replicate Sequences) so that users can replicate sequences before
shutdown and then disable the subscriptions so that there won't be a
corresponding walsender.

--
With Regards,
Amit Kapila.



Re: Logical Replication of sequences

От
Ashutosh Bapat
Дата:


On Tue, Jun 4, 2024 at 4:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

3. Replicate published sequences via walsender at the time of shutdown
or incrementally while decoding checkpoint record. The two ways to
achieve this are: (a) WAL log a special NOOP record just before
shutting down checkpointer. Then allow the WALsender to read the
sequence data and send it to the subscriber while decoding the new
NOOP record. (b) Similar to the previous idea but instead of WAL
logging a new record directly invokes a decoding callback after
walsender receives a request to shutdown which will allow pgoutput to
read and send required sequences. This approach has a drawback that we
are adding more work at the time of shutdown but note that we already
waits for all the WAL records to be decoded and sent before shutting
down the walsender during shutdown of the node.

Any other ideas?


In case of primary crash the sequence won't get replicated. That is true even with the previous approach in case walsender is shut down because of a crash, but it is more serious with this approach. How about periodically sending this information? 

--
Best Wishes,
Ashutosh Bapat

Re: Logical Replication of sequences

От
Yogesh Sharma
Дата:
On 6/4/24 06:57, Amit Kapila wrote:
> 1. Provide a tool to copy all the sequences from publisher to
> subscriber. The major drawback is that users need to perform this as
> an additional step during the upgrade which would be inconvenient and
> probably not as useful as some built-in mechanism.

Agree, this requires additional steps. Not a preferred approach in my
opinion. When a large set of sequences are present, it will add
additional downtime for upgrade process.

> 2. Provide a command say Alter Subscription ...  Replicate Sequences
> (or something like that) which users can perform before shutdown of
> the publisher node during upgrade. This will allow copying all the
> sequences from the publisher node to the subscriber node directly.
> Similar to previous approach, this could also be inconvenient for
> users.

This is similar to option 1 except that it is a SQL command now. Still
not a preferred approach in my opinion. When a large set of sequences are
present, it will add additional downtime for upgrade process.


> 3. Replicate published sequences via walsender at the time of shutdown
> or incrementally while decoding checkpoint record. The two ways to
> achieve this are: (a) WAL log a special NOOP record just before
> shutting down checkpointer. Then allow the WALsender to read the
> sequence data and send it to the subscriber while decoding the new
> NOOP record. (b) Similar to the previous idea but instead of WAL
> logging a new record directly invokes a decoding callback after
> walsender receives a request to shutdown which will allow pgoutput to
> read and send required sequences. This approach has a drawback that we
> are adding more work at the time of shutdown but note that we already
> waits for all the WAL records to be decoded and sent before shutting
> down the walsender during shutdown of the node.

At the time of shutdown a) most logical upgrades don't necessarily call
for shutdown b) it will still add to total downtime with large set of
sequences. Incremental option is better as it will not require a shutdown.

I do see a scenario where sequence of events can lead to loss of sequence
and generate duplicate sequence values, if subscriber starts consuming
sequences while publisher is also consuming them. In such cases, subscriber
shall not be allowed sequence consumption.



-- 
Kind Regards,
Yogesh Sharma
Open Source Enthusiast and Advocate
PostgreSQL Contributors Team @ RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Logical Replication of sequences

От
Amit Kapila
Дата:
On Tue, Jun 4, 2024 at 7:40 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Tue, Jun 4, 2024 at 4:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>>
>> 3. Replicate published sequences via walsender at the time of shutdown
>> or incrementally while decoding checkpoint record. The two ways to
>> achieve this are: (a) WAL log a special NOOP record just before
>> shutting down checkpointer. Then allow the WALsender to read the
>> sequence data and send it to the subscriber while decoding the new
>> NOOP record. (b) Similar to the previous idea but instead of WAL
>> logging a new record directly invokes a decoding callback after
>> walsender receives a request to shutdown which will allow pgoutput to
>> read and send required sequences. This approach has a drawback that we
>> are adding more work at the time of shutdown but note that we already
>> waits for all the WAL records to be decoded and sent before shutting
>> down the walsender during shutdown of the node.
>>
>> Any other ideas?
>>
>
> In case of primary crash the sequence won't get replicated. That is true even with the previous approach in case
walsenderis shut down because of a crash, but it is more serious with this approach. 
>

Right, but if we just want to support a major version upgrade scenario
then this should be fine because upgrades require a clean shutdown.

>
 How about periodically sending this information?
>

Now, if we want to support some sort of failover then probably this
will help. Do you have that use case in mind? If we want to send
periodically then we can do it when decoding checkpoint
(XLOG_CHECKPOINT_ONLINE) or some other periodic WAL record like
running_xacts (XLOG_RUNNING_XACTS).

--
With Regards,
Amit Kapila.



Re: Logical Replication of sequences

От
Amit Kapila
Дата:
On Tue, Jun 4, 2024 at 8:56 PM Yogesh Sharma
<yogesh.sharma@catprosystems.com> wrote:
>
> On 6/4/24 06:57, Amit Kapila wrote:
>
> > 2. Provide a command say Alter Subscription ...  Replicate Sequences
> > (or something like that) which users can perform before shutdown of
> > the publisher node during upgrade. This will allow copying all the
> > sequences from the publisher node to the subscriber node directly.
> > Similar to previous approach, this could also be inconvenient for
> > users.
>
> This is similar to option 1 except that it is a SQL command now.
>

Right, but I would still prefer a command as it provides clear steps
for the upgrade. Users need to perform (a) Replicate Sequences for a
particular subscription (b) Disable that subscription (c) Perform (a)
and (b) for all the subscriptions corresponding to the publisher we
want to shut down for upgrade.

I agree there are some manual steps involved here but it is advisable
for users to ensure that they have received the required data on the
subscriber before the upgrade of the publisher node, otherwise, they
may not be able to continue replication after the upgrade. For
example, see the "Prepare for publisher upgrades" step in pg_upgrade
docs [1].

>
> > 3. Replicate published sequences via walsender at the time of shutdown
> > or incrementally while decoding checkpoint record. The two ways to
> > achieve this are: (a) WAL log a special NOOP record just before
> > shutting down checkpointer. Then allow the WALsender to read the
> > sequence data and send it to the subscriber while decoding the new
> > NOOP record. (b) Similar to the previous idea but instead of WAL
> > logging a new record directly invokes a decoding callback after
> > walsender receives a request to shutdown which will allow pgoutput to
> > read and send required sequences. This approach has a drawback that we
> > are adding more work at the time of shutdown but note that we already
> > waits for all the WAL records to be decoded and sent before shutting
> > down the walsender during shutdown of the node.
>
> At the time of shutdown a) most logical upgrades don't necessarily call
> for shutdown
>

Won't the major version upgrade expect that the node is down? Refer to
step "Stop both servers" in [1].

>
 b) it will still add to total downtime with large set of
> sequences. Incremental option is better as it will not require a shutdown.
>
> I do see a scenario where sequence of events can lead to loss of sequence
> and generate duplicate sequence values, if subscriber starts consuming
> sequences while publisher is also consuming them. In such cases, subscriber
> shall not be allowed sequence consumption.
>

It would be fine to not allow subscribers to consume sequences that
are being logically replicated but what about the cases where we
haven't sent the latest values of sequences before the shutdown of the
publisher? In such a case, the publisher would have already consumed
some values that wouldn't have been sent to the subscriber and now
when the publisher is down then even if we re-allow the sequence
values to be consumed from the subscriber, it can lead to duplicate
values.

[1] - https://www.postgresql.org/docs/devel/pgupgrade.html

--
With Regards,
Amit Kapila.



Re: Logical Replication of sequences

От
Peter Eisentraut
Дата:
On 04.06.24 12:57, Amit Kapila wrote:
> 2. Provide a command say Alter Subscription ...  Replicate Sequences
> (or something like that) which users can perform before shutdown of
> the publisher node during upgrade. This will allow copying all the
> sequences from the publisher node to the subscriber node directly.
> Similar to previous approach, this could also be inconvenient for
> users.

I would start with this.  In any case, you're going to need to write 
code to collect all the sequence values, send them over some protocol, 
apply them on the subscriber.  The easiest way to start is to trigger 
that manually.  Then later you can add other ways to trigger it, either 
by timer or around shutdown, or whatever other ideas there might be.



Re: Logical Replication of sequences

От
Amit Kapila
Дата:
On Wed, Jun 5, 2024 at 9:13 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jun 4, 2024 at 8:56 PM Yogesh Sharma
> <yogesh.sharma@catprosystems.com> wrote:
> >
> > On 6/4/24 06:57, Amit Kapila wrote:
> >
> > > 2. Provide a command say Alter Subscription ...  Replicate Sequences
> > > (or something like that) which users can perform before shutdown of
> > > the publisher node during upgrade. This will allow copying all the
> > > sequences from the publisher node to the subscriber node directly.
> > > Similar to previous approach, this could also be inconvenient for
> > > users.
> >
> > This is similar to option 1 except that it is a SQL command now.
> >
>
> Right, but I would still prefer a command as it provides clear steps
> for the upgrade. Users need to perform (a) Replicate Sequences for a
> particular subscription (b) Disable that subscription (c) Perform (a)
> and (b) for all the subscriptions corresponding to the publisher we
> want to shut down for upgrade.
>

Another advantage of this approach over just a plain tool to copy all
sequences before upgrade is that here we can have the facility to copy
just the required sequences. I mean the set sequences that the user
has specified as part of the publication.

--
With Regards,
Amit Kapila.



Re: Logical Replication of sequences

От
Bharath Rupireddy
Дата:
Hi,

On Tue, Jun 4, 2024 at 5:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> Even if we decode it periodically (say each time we decode the
> checkpoint record) then also we need to send the entire set of
> sequences at shutdown. This is because the sequences may have changed
> from the last time we sent them.

Agree. How about decoding and sending only the sequences that are
changed from the last time when they were sent? I know it requires a
bit of tracking and more work, but all I'm looking for is to reduce
the amount of work that walsenders need to do during the shutdown.

Having said that, I like the idea of letting the user sync the
sequences via ALTER SUBSCRIPTION command and not weave the logic into
the shutdown checkpoint path. As Peter Eisentraut said here
https://www.postgresql.org/message-id/42e5cb35-4aeb-4f58-8091-90619c7c3ecc%40eisentraut.org,
this can be a good starting point to get going.

> > I can imagine a case where there are tens
> > of thousands of sequences in a production server, and surely decoding
> > and sending them just during the shutdown can take a lot of time
> > hampering the overall server uptime.
>
> It is possible but we will send only the sequences that belong to
> publications for which walsender is supposed to send the required
> data.

Right, but what if all the publication tables can have tens of
thousands of sequences.

> Now, we can also imagine providing option 2 (Alter Subscription
> ... Replicate Sequences) so that users can replicate sequences before
> shutdown and then disable the subscriptions so that there won't be a
> corresponding walsender.

As stated above, I like this idea to start with.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Logical Replication of sequences

От
Amit Kapila
Дата:
On Wed, Jun 5, 2024 at 12:51 PM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 04.06.24 12:57, Amit Kapila wrote:
> > 2. Provide a command say Alter Subscription ...  Replicate Sequences
> > (or something like that) which users can perform before shutdown of
> > the publisher node during upgrade. This will allow copying all the
> > sequences from the publisher node to the subscriber node directly.
> > Similar to previous approach, this could also be inconvenient for
> > users.
>
> I would start with this.  In any case, you're going to need to write
> code to collect all the sequence values, send them over some protocol,
> apply them on the subscriber.  The easiest way to start is to trigger
> that manually.  Then later you can add other ways to trigger it, either
> by timer or around shutdown, or whatever other ideas there might be.
>

Agreed. To achieve this, we can allow sequences to be copied during
the initial CREATE SUBSCRIPTION command similar to what we do for
tables. And then later by new/existing command, we re-copy the already
existing sequences on the subscriber.

The options for the new command could be:
Alter Subscription ... Refresh Sequences
Alter Subscription ... Replicate Sequences

In the second option, we need to introduce a new keyword Replicate.
Can you think of any better option?

In addition to the above, the command Alter Subscription .. Refresh
Publication will fetch any missing sequences similar to what it does
for tables.

Thoughts?

--
With Regards,
Amit Kapila.



Re: Logical Replication of sequences

От
Ashutosh Bapat
Дата:


On Wed, Jun 5, 2024 at 8:45 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jun 4, 2024 at 7:40 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Tue, Jun 4, 2024 at 4:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>>
>> 3. Replicate published sequences via walsender at the time of shutdown
>> or incrementally while decoding checkpoint record. The two ways to
>> achieve this are: (a) WAL log a special NOOP record just before
>> shutting down checkpointer. Then allow the WALsender to read the
>> sequence data and send it to the subscriber while decoding the new
>> NOOP record. (b) Similar to the previous idea but instead of WAL
>> logging a new record directly invokes a decoding callback after
>> walsender receives a request to shutdown which will allow pgoutput to
>> read and send required sequences. This approach has a drawback that we
>> are adding more work at the time of shutdown but note that we already
>> waits for all the WAL records to be decoded and sent before shutting
>> down the walsender during shutdown of the node.
>>
>> Any other ideas?
>>
>
> In case of primary crash the sequence won't get replicated. That is true even with the previous approach in case walsender is shut down because of a crash, but it is more serious with this approach.
>

Right, but if we just want to support a major version upgrade scenario
then this should be fine because upgrades require a clean shutdown.

>
 How about periodically sending this information?
>

Now, if we want to support some sort of failover then probably this
will help. Do you have that use case in mind?

Regular failover was a goal for supporting logical replication of sequences. That might be more common than major upgrade scenario.
 
If we want to send
periodically then we can do it when decoding checkpoint
(XLOG_CHECKPOINT_ONLINE) or some other periodic WAL record like
running_xacts (XLOG_RUNNING_XACTS).


Yeah. I am thinking along those lines.

It must be noted, however, that none of those optional make sure that the replicated sequence's states are consistent with the replicated object state which use those sequences. E.g. table t1 uses sequence s1. By last sequence replication, as of time T1, let's say t1 had consumed values upto vl1 from s1. But later, by time T2, it consumed values upto vl2 which were not replicated but the changes to t1 by T2 were replicated. If failover happens at that point, INSERTs on t1 would fail because of duplicate keys (values between vl1 and vl2). Previous attempt to support logical sequence replication solved this problem by replicating a future state of sequences (current value +/- log count). Similarly, if the sequence was ALTERed between T1 and T2, the state of sequence on replica would be inconsistent with the state of t1. Failing over at this stage, might end t1 in an inconsistent state.

--
Best Wishes,
Ashutosh Bapat

Re: Logical Replication of sequences

От
Amit Kapila
Дата:
On Wed, Jun 5, 2024 at 6:01 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Wed, Jun 5, 2024 at 8:45 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>>  How about periodically sending this information?
>> >
>>
>> Now, if we want to support some sort of failover then probably this
>> will help. Do you have that use case in mind?
>
>
> Regular failover was a goal for supporting logical replication of sequences. That might be more common than major
upgradescenario. 
>

We can't support regular failovers to subscribers unless we can
replicate/copy slots because the existing nodes connected to the
current publisher/primary would expect that. It should be primarily
useful for major version upgrades at this stage.

>>
>> If we want to send
>> periodically then we can do it when decoding checkpoint
>> (XLOG_CHECKPOINT_ONLINE) or some other periodic WAL record like
>> running_xacts (XLOG_RUNNING_XACTS).
>>
>
> Yeah. I am thinking along those lines.
>
> It must be noted, however, that none of those optional make sure that the replicated sequence's states are consistent
withthe replicated object state which use those sequences. 
>

Right, I feel as others are advocating, it seems better to support it
manually via command and then later we can extend it to do at shutdown
or at some regular intervals. If we do that then we should be able to
support major version upgrades and planned switchover.


--
With Regards,
Amit Kapila.



Re: Logical Replication of sequences

От
Masahiko Sawada
Дата:
On Wed, Jun 5, 2024 at 12:43 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jun 4, 2024 at 8:56 PM Yogesh Sharma
> <yogesh.sharma@catprosystems.com> wrote:
> >
> > On 6/4/24 06:57, Amit Kapila wrote:
> >
> > > 2. Provide a command say Alter Subscription ...  Replicate Sequences
> > > (or something like that) which users can perform before shutdown of
> > > the publisher node during upgrade. This will allow copying all the
> > > sequences from the publisher node to the subscriber node directly.
> > > Similar to previous approach, this could also be inconvenient for
> > > users.
> >
> > This is similar to option 1 except that it is a SQL command now.
> >
>
> Right, but I would still prefer a command as it provides clear steps
> for the upgrade. Users need to perform (a) Replicate Sequences for a
> particular subscription (b) Disable that subscription (c) Perform (a)
> and (b) for all the subscriptions corresponding to the publisher we
> want to shut down for upgrade.
>
> I agree there are some manual steps involved here but it is advisable
> for users to ensure that they have received the required data on the
> subscriber before the upgrade of the publisher node, otherwise, they
> may not be able to continue replication after the upgrade. For
> example, see the "Prepare for publisher upgrades" step in pg_upgrade
> docs [1].
>
> >
> > > 3. Replicate published sequences via walsender at the time of shutdown
> > > or incrementally while decoding checkpoint record. The two ways to
> > > achieve this are: (a) WAL log a special NOOP record just before
> > > shutting down checkpointer. Then allow the WALsender to read the
> > > sequence data and send it to the subscriber while decoding the new
> > > NOOP record. (b) Similar to the previous idea but instead of WAL
> > > logging a new record directly invokes a decoding callback after
> > > walsender receives a request to shutdown which will allow pgoutput to
> > > read and send required sequences. This approach has a drawback that we
> > > are adding more work at the time of shutdown but note that we already
> > > waits for all the WAL records to be decoded and sent before shutting
> > > down the walsender during shutdown of the node.
> >
> > At the time of shutdown a) most logical upgrades don't necessarily call
> > for shutdown
> >
>
> Won't the major version upgrade expect that the node is down? Refer to
> step "Stop both servers" in [1].

I think the idea is that the publisher is the old version and the
subscriber is the new version, and changes generated on the publisher
are replicated to the subscriber via logical replication. And at some
point, we change the application (or a router) settings so that no
more transactions come to the publisher, do the last upgrade
preparation work (e.g. copying the latest sequence values if
requried), and then change the application so that new transactions
come to the subscriber.

I remember the blog post about Knock doing a similar process to
upgrade the clusters with minimal downtime[1].

Regards,

[1] https://knock.app/blog/zero-downtime-postgres-upgrades


--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Logical Replication of sequences

От
Amit Kapila
Дата:
On Wed, Jun 5, 2024 at 3:17 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Tue, Jun 4, 2024 at 5:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > Even if we decode it periodically (say each time we decode the
> > checkpoint record) then also we need to send the entire set of
> > sequences at shutdown. This is because the sequences may have changed
> > from the last time we sent them.
>
> Agree. How about decoding and sending only the sequences that are
> changed from the last time when they were sent? I know it requires a
> bit of tracking and more work, but all I'm looking for is to reduce
> the amount of work that walsenders need to do during the shutdown.
>

I see your point but going towards tracking the changed sequences
sounds like moving towards what we do for incremental backups unless
we can invent some other smart way.

> Having said that, I like the idea of letting the user sync the
> sequences via ALTER SUBSCRIPTION command and not weave the logic into
> the shutdown checkpoint path. As Peter Eisentraut said here
> https://www.postgresql.org/message-id/42e5cb35-4aeb-4f58-8091-90619c7c3ecc%40eisentraut.org,
> this can be a good starting point to get going.
>

Agreed.

> > > I can imagine a case where there are tens
> > > of thousands of sequences in a production server, and surely decoding
> > > and sending them just during the shutdown can take a lot of time
> > > hampering the overall server uptime.
> >
> > It is possible but we will send only the sequences that belong to
> > publications for which walsender is supposed to send the required
> > data.
>
> Right, but what if all the publication tables can have tens of
> thousands of sequences.
>

In such cases we have no option but to send all the sequences.

> > Now, we can also imagine providing option 2 (Alter Subscription
> > ... Replicate Sequences) so that users can replicate sequences before
> > shutdown and then disable the subscriptions so that there won't be a
> > corresponding walsender.
>
> As stated above, I like this idea to start with.
>

+1.


--
With Regards,
Amit Kapila.



Re: Logical Replication of sequences

От
Masahiko Sawada
Дата:
On Wed, Jun 5, 2024 at 9:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jun 5, 2024 at 12:51 PM Peter Eisentraut <peter@eisentraut.org> wrote:
> >
> > On 04.06.24 12:57, Amit Kapila wrote:
> > > 2. Provide a command say Alter Subscription ...  Replicate Sequences
> > > (or something like that) which users can perform before shutdown of
> > > the publisher node during upgrade. This will allow copying all the
> > > sequences from the publisher node to the subscriber node directly.
> > > Similar to previous approach, this could also be inconvenient for
> > > users.
> >
> > I would start with this.  In any case, you're going to need to write
> > code to collect all the sequence values, send them over some protocol,
> > apply them on the subscriber.  The easiest way to start is to trigger
> > that manually.  Then later you can add other ways to trigger it, either
> > by timer or around shutdown, or whatever other ideas there might be.
> >
>
> Agreed.

+1

> To achieve this, we can allow sequences to be copied during
> the initial CREATE SUBSCRIPTION command similar to what we do for
> tables. And then later by new/existing command, we re-copy the already
> existing sequences on the subscriber.
>
> The options for the new command could be:
> Alter Subscription ... Refresh Sequences
> Alter Subscription ... Replicate Sequences
>
> In the second option, we need to introduce a new keyword Replicate.
> Can you think of any better option?

Another idea is doing that using options. For example,

For initial sequences synchronization:

CREATE SUBSCRIPTION ... WITH (copy_sequence = true);

For re-copy (or update) sequences:

ALTER SUBSCRIPTION ... REFRESH PUBLICATION WITH (copy_sequence = true);

>
> In addition to the above, the command Alter Subscription .. Refresh
> Publication will fetch any missing sequences similar to what it does
> for tables.

On the subscriber side, do we need to track which sequences are
created via CREATE/ALTER SUBSCRIPTION?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Logical Replication of sequences

От
Amit Kapila
Дата:
On Thu, Jun 6, 2024 at 9:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jun 5, 2024 at 12:43 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Jun 4, 2024 at 8:56 PM Yogesh Sharma
> > <yogesh.sharma@catprosystems.com> wrote:
> > >
> > > On 6/4/24 06:57, Amit Kapila wrote:
> > >
> > > > 2. Provide a command say Alter Subscription ...  Replicate Sequences
> > > > (or something like that) which users can perform before shutdown of
> > > > the publisher node during upgrade. This will allow copying all the
> > > > sequences from the publisher node to the subscriber node directly.
> > > > Similar to previous approach, this could also be inconvenient for
> > > > users.
> > >
> > > This is similar to option 1 except that it is a SQL command now.
> > >
> >
> > Right, but I would still prefer a command as it provides clear steps
> > for the upgrade. Users need to perform (a) Replicate Sequences for a
> > particular subscription (b) Disable that subscription (c) Perform (a)
> > and (b) for all the subscriptions corresponding to the publisher we
> > want to shut down for upgrade.
> >
> > I agree there are some manual steps involved here but it is advisable
> > for users to ensure that they have received the required data on the
> > subscriber before the upgrade of the publisher node, otherwise, they
> > may not be able to continue replication after the upgrade. For
> > example, see the "Prepare for publisher upgrades" step in pg_upgrade
> > docs [1].
> >
> > >
> > > > 3. Replicate published sequences via walsender at the time of shutdown
> > > > or incrementally while decoding checkpoint record. The two ways to
> > > > achieve this are: (a) WAL log a special NOOP record just before
> > > > shutting down checkpointer. Then allow the WALsender to read the
> > > > sequence data and send it to the subscriber while decoding the new
> > > > NOOP record. (b) Similar to the previous idea but instead of WAL
> > > > logging a new record directly invokes a decoding callback after
> > > > walsender receives a request to shutdown which will allow pgoutput to
> > > > read and send required sequences. This approach has a drawback that we
> > > > are adding more work at the time of shutdown but note that we already
> > > > waits for all the WAL records to be decoded and sent before shutting
> > > > down the walsender during shutdown of the node.
> > >
> > > At the time of shutdown a) most logical upgrades don't necessarily call
> > > for shutdown
> > >
> >
> > Won't the major version upgrade expect that the node is down? Refer to
> > step "Stop both servers" in [1].
>
> I think the idea is that the publisher is the old version and the
> subscriber is the new version, and changes generated on the publisher
> are replicated to the subscriber via logical replication. And at some
> point, we change the application (or a router) settings so that no
> more transactions come to the publisher, do the last upgrade
> preparation work (e.g. copying the latest sequence values if
> requried), and then change the application so that new transactions
> come to the subscriber.
>

Okay, thanks for sharing the exact steps. If one has to follow that
path then sending incrementally (at checkpoint WAL or other times)
won't work because we want to ensure that the sequences are up-to-date
before the application starts using the new database. To do that in a
bullet-proof way, one has to copy/replicate sequences during the
requests to the new database are paused (Reference from the blog you
shared: For the first second after flipping the flag, our application
artificially paused any new database requests for one second.).
Currently, they are using some guesswork to replicate sequences that
require manual verification and more manual work for each sequence.
The new command (Alter Subscription ... Replicate Sequence) should
ease their procedure and can do things where they would require no or
very less verification.

> I remember the blog post about Knock doing a similar process to
> upgrade the clusters with minimal downtime[1].
>

Thanks for sharing the blog post.

--
With Regards,
Amit Kapila.



Re: Logical Replication of sequences

От
Amit Kapila
Дата:
On Thu, Jun 6, 2024 at 11:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jun 5, 2024 at 9:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> > To achieve this, we can allow sequences to be copied during
> > the initial CREATE SUBSCRIPTION command similar to what we do for
> > tables. And then later by new/existing command, we re-copy the already
> > existing sequences on the subscriber.
> >
> > The options for the new command could be:
> > Alter Subscription ... Refresh Sequences
> > Alter Subscription ... Replicate Sequences
> >
> > In the second option, we need to introduce a new keyword Replicate.
> > Can you think of any better option?
>
> Another idea is doing that using options. For example,
>
> For initial sequences synchronization:
>
> CREATE SUBSCRIPTION ... WITH (copy_sequence = true);
>

How will it interact with the existing copy_data option? So copy_data
will become equivalent to copy_table_data, right?

> For re-copy (or update) sequences:
>
> ALTER SUBSCRIPTION ... REFRESH PUBLICATION WITH (copy_sequence = true);
>

Similar to the previous point it can be slightly confusing w.r.t
copy_data. And would copy_sequence here mean that it would copy
sequence values of both pre-existing and newly added sequences, if so,
that would make it behave differently than copy_data? The other
possibility in this direction would be to introduce an option like
replicate_all_sequences/copy_all_sequences which indicates a copy of
both pre-existing and new sequences, if any.

If we want to go in the direction of having an option such as
copy_(all)_sequences then do you think specifying that copy_data is
just for tables in the docs would be sufficient? I am afraid that it
would be confusing for users.

> >
> > In addition to the above, the command Alter Subscription .. Refresh
> > Publication will fetch any missing sequences similar to what it does
> > for tables.
>
> On the subscriber side, do we need to track which sequences are
> created via CREATE/ALTER SUBSCRIPTION?
>

I think so unless we find some other way to know at refresh
publication time which all new sequences need to be part of the
subscription. What should be the behavior w.r.t sequences when the
user performs ALTER SUBSCRIPTION ... REFRESH PUBLICATION? I was
thinking similar to tables, it should fetch any missing sequence
information from the publisher.

--
With Regards,
Amit Kapila.



Re: Logical Replication of sequences

От
Ashutosh Bapat
Дата:


On Thu, Jun 6, 2024 at 9:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jun 5, 2024 at 6:01 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Wed, Jun 5, 2024 at 8:45 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>>  How about periodically sending this information?
>> >
>>
>> Now, if we want to support some sort of failover then probably this
>> will help. Do you have that use case in mind?
>
>
> Regular failover was a goal for supporting logical replication of sequences. That might be more common than major upgrade scenario.
>

We can't support regular failovers to subscribers unless we can
replicate/copy slots because the existing nodes connected to the
current publisher/primary would expect that. It should be primarily
useful for major version upgrades at this stage.

We don't want to design it in a way that requires major rework when we are able to copy slots and then support regular failovers. That's when the consistency between a sequence and the table using it would be a must. So it's better that we take that into consideration now.

--
Best Wishes,
Ashutosh Bapat

Re: Logical Replication of sequences

От
Amit Kapila
Дата:
On Thu, Jun 6, 2024 at 3:44 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Thu, Jun 6, 2024 at 9:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> On Wed, Jun 5, 2024 at 6:01 PM Ashutosh Bapat
>> <ashutosh.bapat.oss@gmail.com> wrote:
>> >
>> > On Wed, Jun 5, 2024 at 8:45 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >>
>> >>  How about periodically sending this information?
>> >> >
>> >>
>> >> Now, if we want to support some sort of failover then probably this
>> >> will help. Do you have that use case in mind?
>> >
>> >
>> > Regular failover was a goal for supporting logical replication of sequences. That might be more common than major
upgradescenario. 
>> >
>>
>> We can't support regular failovers to subscribers unless we can
>> replicate/copy slots because the existing nodes connected to the
>> current publisher/primary would expect that. It should be primarily
>> useful for major version upgrades at this stage.
>
>
> We don't want to design it in a way that requires major rework when we are able to copy slots and then support
regularfailover. 
>

I don't think we can just copy slots like we do for standbys. The
slots would require WAL locations to continue, so not sure if we can
make it work for failover for subscribers.

>
 That's when the consistency between a sequence and the table using it
would be a must. So it's better that we take that into consideration
now.
>

With the ideas being discussed here, I could only see the use case of
a major version upgrade or planned switchover to work. If we come up
with any other agreeable way that is better than this then we can
consider the same.

--
With Regards,
Amit Kapila.



Re: Logical Replication of sequences

От
Dilip Kumar
Дата:
On Thu, Jun 6, 2024 at 9:34 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jun 5, 2024 at 3:17 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > On Tue, Jun 4, 2024 at 5:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > Even if we decode it periodically (say each time we decode the
> > > checkpoint record) then also we need to send the entire set of
> > > sequences at shutdown. This is because the sequences may have changed
> > > from the last time we sent them.
> >
> > Agree. How about decoding and sending only the sequences that are
> > changed from the last time when they were sent? I know it requires a
> > bit of tracking and more work, but all I'm looking for is to reduce
> > the amount of work that walsenders need to do during the shutdown.
> >
>
> I see your point but going towards tracking the changed sequences
> sounds like moving towards what we do for incremental backups unless
> we can invent some other smart way.

Yes, we would need an entirely new infrastructure to track the
sequence change since the last sync. We can only determine this from
WAL, and relying on it would somehow bring us back to the approach we
were trying to achieve with logical decoding of sequences patch.

> > Having said that, I like the idea of letting the user sync the
> > sequences via ALTER SUBSCRIPTION command and not weave the logic into
> > the shutdown checkpoint path. As Peter Eisentraut said here
> > https://www.postgresql.org/message-id/42e5cb35-4aeb-4f58-8091-90619c7c3ecc%40eisentraut.org,
> > this can be a good starting point to get going.
> >
>
> Agreed.

+1

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Logical Replication of sequences

От
Masahiko Sawada
Дата:
On Thu, Jun 6, 2024 at 6:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Jun 6, 2024 at 11:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Jun 5, 2024 at 9:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > > To achieve this, we can allow sequences to be copied during
> > > the initial CREATE SUBSCRIPTION command similar to what we do for
> > > tables. And then later by new/existing command, we re-copy the already
> > > existing sequences on the subscriber.
> > >
> > > The options for the new command could be:
> > > Alter Subscription ... Refresh Sequences
> > > Alter Subscription ... Replicate Sequences
> > >
> > > In the second option, we need to introduce a new keyword Replicate.
> > > Can you think of any better option?
> >
> > Another idea is doing that using options. For example,
> >
> > For initial sequences synchronization:
> >
> > CREATE SUBSCRIPTION ... WITH (copy_sequence = true);
> >
>
> How will it interact with the existing copy_data option? So copy_data
> will become equivalent to copy_table_data, right?

Right.

>
> > For re-copy (or update) sequences:
> >
> > ALTER SUBSCRIPTION ... REFRESH PUBLICATION WITH (copy_sequence = true);
> >
>
> Similar to the previous point it can be slightly confusing w.r.t
> copy_data. And would copy_sequence here mean that it would copy
> sequence values of both pre-existing and newly added sequences, if so,
> that would make it behave differently than copy_data?  The other
> possibility in this direction would be to introduce an option like
> replicate_all_sequences/copy_all_sequences which indicates a copy of
> both pre-existing and new sequences, if any.

Copying sequence data works differently than replicating table data
(initial data copy and logical replication). So I thought the
copy_sequence option (or whatever better name) always does both
updating pre-existing sequences and adding new sequences. REFRESH
PUBLICATION updates the tables to be subscribed, so we also update or
add sequences associated to these tables.

>
> If we want to go in the direction of having an option such as
> copy_(all)_sequences then do you think specifying that copy_data is
> just for tables in the docs would be sufficient? I am afraid that it
> would be confusing for users.

I see your point. But I guess it would not be very problematic as it
doesn't break the current behavior and copy_(all)_sequences is
primarily for upgrade use cases.

>
> > >
> > > In addition to the above, the command Alter Subscription .. Refresh
> > > Publication will fetch any missing sequences similar to what it does
> > > for tables.
> >
> > On the subscriber side, do we need to track which sequences are
> > created via CREATE/ALTER SUBSCRIPTION?
> >
>
> I think so unless we find some other way to know at refresh
> publication time which all new sequences need to be part of the
> subscription. What should be the behavior w.r.t sequences when the
> user performs ALTER SUBSCRIPTION ... REFRESH PUBLICATION? I was
> thinking similar to tables, it should fetch any missing sequence
> information from the publisher.

It seems to make sense to me. But I have one question: do we want to
support replicating sequences that are not associated with any tables?
if yes, what if we refresh two different subscriptions that subscribe
to different tables on the same database? On the other hand, if no
(i.e. replicating only sequences owned by tables), can we know which
sequences to replicate by checking the subscribed tables?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Logical Replication of sequences

От
Amit Kapila
Дата:
On Fri, Jun 7, 2024 at 7:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Jun 6, 2024 at 6:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Jun 6, 2024 at 11:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Wed, Jun 5, 2024 at 9:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > >
> > > > To achieve this, we can allow sequences to be copied during
> > > > the initial CREATE SUBSCRIPTION command similar to what we do for
> > > > tables. And then later by new/existing command, we re-copy the already
> > > > existing sequences on the subscriber.
> > > >
> > > > The options for the new command could be:
> > > > Alter Subscription ... Refresh Sequences
> > > > Alter Subscription ... Replicate Sequences
> > > >
> > > > In the second option, we need to introduce a new keyword Replicate.
> > > > Can you think of any better option?
> > >
> > > Another idea is doing that using options. For example,
> > >
> > > For initial sequences synchronization:
> > >
> > > CREATE SUBSCRIPTION ... WITH (copy_sequence = true);
> > >
> >
> > How will it interact with the existing copy_data option? So copy_data
> > will become equivalent to copy_table_data, right?
>
> Right.
>
> >
> > > For re-copy (or update) sequences:
> > >
> > > ALTER SUBSCRIPTION ... REFRESH PUBLICATION WITH (copy_sequence = true);
> > >
> >
> > Similar to the previous point it can be slightly confusing w.r.t
> > copy_data. And would copy_sequence here mean that it would copy
> > sequence values of both pre-existing and newly added sequences, if so,
> > that would make it behave differently than copy_data?  The other
> > possibility in this direction would be to introduce an option like
> > replicate_all_sequences/copy_all_sequences which indicates a copy of
> > both pre-existing and new sequences, if any.
>
> Copying sequence data works differently than replicating table data
> (initial data copy and logical replication). So I thought the
> copy_sequence option (or whatever better name) always does both
> updating pre-existing sequences and adding new sequences. REFRESH
> PUBLICATION updates the tables to be subscribed, so we also update or
> add sequences associated to these tables.
>

Are you imagining the behavior for sequences associated with tables
differently than the ones defined by the CREATE SEQUENCE .. command? I
was thinking that users would associate sequences with publications
similar to what we do for tables for both cases. For example, they
need to explicitly mention the sequences they want to replicate by
commands like CREATE PUBLICATION ... FOR SEQUENCE s1, s2, ...; CREATE
PUBLICATION ... FOR ALL SEQUENCES, or CREATE PUBLICATION ... FOR
SEQUENCES IN SCHEMA sch1;

In this, variants FOR ALL SEQUENCES and SEQUENCES IN SCHEMA sch1
should copy both the explicitly defined sequences and sequences
defined with the tables. Do you think a different variant for just
copying sequences implicitly associated with tables (say for identity
columns)?

>
> >
> > > >
> > > > In addition to the above, the command Alter Subscription .. Refresh
> > > > Publication will fetch any missing sequences similar to what it does
> > > > for tables.
> > >
> > > On the subscriber side, do we need to track which sequences are
> > > created via CREATE/ALTER SUBSCRIPTION?
> > >
> >
> > I think so unless we find some other way to know at refresh
> > publication time which all new sequences need to be part of the
> > subscription. What should be the behavior w.r.t sequences when the
> > user performs ALTER SUBSCRIPTION ... REFRESH PUBLICATION? I was
> > thinking similar to tables, it should fetch any missing sequence
> > information from the publisher.
>
> It seems to make sense to me. But I have one question: do we want to
> support replicating sequences that are not associated with any tables?
>

Yes, unless we see a problem with it.

> if yes, what if we refresh two different subscriptions that subscribe
> to different tables on the same database?

What problem do you see with it?

>
 On the other hand, if no
> (i.e. replicating only sequences owned by tables), can we know which
> sequences to replicate by checking the subscribed tables?
>

Sorry, I didn't understand your question. Can you please try to
explain in more words or use some examples?

--
With Regards,
Amit Kapila.



Re: Logical Replication of sequences

От
vignesh C
Дата:
On Wed, 5 Jun 2024 at 14:11, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jun 5, 2024 at 9:13 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Jun 4, 2024 at 8:56 PM Yogesh Sharma
> > <yogesh.sharma@catprosystems.com> wrote:
> > >
> > > On 6/4/24 06:57, Amit Kapila wrote:
> > >
> > > > 2. Provide a command say Alter Subscription ...  Replicate Sequences
> > > > (or something like that) which users can perform before shutdown of
> > > > the publisher node during upgrade. This will allow copying all the
> > > > sequences from the publisher node to the subscriber node directly.
> > > > Similar to previous approach, this could also be inconvenient for
> > > > users.
> > >
> > > This is similar to option 1 except that it is a SQL command now.
> > >
> >
> > Right, but I would still prefer a command as it provides clear steps
> > for the upgrade. Users need to perform (a) Replicate Sequences for a
> > particular subscription (b) Disable that subscription (c) Perform (a)
> > and (b) for all the subscriptions corresponding to the publisher we
> > want to shut down for upgrade.
> >
>
> Another advantage of this approach over just a plain tool to copy all
> sequences before upgrade is that here we can have the facility to copy
> just the required sequences. I mean the set sequences that the user
> has specified as part of the publication.

Here is a WIP patch to handle synchronizing the sequence during
create/alter subscription. The following changes were made for it:
Subscriber modifications:
Enable sequence synchronization during subscription creation or
alteration using the following syntax:
CREATE SUBSCRIPTION ... WITH (sequences=true);
When a subscription is created with the sequence option enabled, the
sequence list from the specified publications in the subscription will
be retrieved from the publisher. Each sequence's data will then be
copied from the remote publisher sequence to the local subscriber
sequence by using a wal receiver connection. Since all of the sequence
updating is done within a single transaction, if any errors occur
during the copying process, the entire transaction will be rolled
back.

To refresh sequences, use the syntax:
ALTER SUBSCRIPTION REFRESH SEQUENCES;
During sequence refresh, the sequence list is updated by removing
stale sequences and adding any missing sequences. The updated sequence
list is then re-synchronized.

A new catalog table, pg_subscription_seq, has been introduced for
mapping subscriptions to sequences. Additionally, the sequence LSN
(Log Sequence Number) is stored, facilitating determination of
sequence changes occurring before or after the returned sequence
state.

I have taken some code changes from Tomas's patch at [1].
I'll adjust the syntax as needed based on the ongoing discussion at  [2].

[1] - https://www.postgresql.org/message-id/09613730-5ee9-4cc3-82d8-f089be90aa64%40enterprisedb.com
[2] - https://www.postgresql.org/message-id/CAA4eK1K2X%2BPaErtGVQPD0k_5XqxjV_Cwg37%2B-pWsmKFncwc7Wg%40mail.gmail.com

Regards,
Vignesh

Вложения

Re: Logical Replication of sequences

От
Masahiko Sawada
Дата:
On Fri, Jun 7, 2024 at 7:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jun 7, 2024 at 7:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Jun 6, 2024 at 6:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Jun 6, 2024 at 11:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Wed, Jun 5, 2024 at 9:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > >
> > > > > To achieve this, we can allow sequences to be copied during
> > > > > the initial CREATE SUBSCRIPTION command similar to what we do for
> > > > > tables. And then later by new/existing command, we re-copy the already
> > > > > existing sequences on the subscriber.
> > > > >
> > > > > The options for the new command could be:
> > > > > Alter Subscription ... Refresh Sequences
> > > > > Alter Subscription ... Replicate Sequences
> > > > >
> > > > > In the second option, we need to introduce a new keyword Replicate.
> > > > > Can you think of any better option?
> > > >
> > > > Another idea is doing that using options. For example,
> > > >
> > > > For initial sequences synchronization:
> > > >
> > > > CREATE SUBSCRIPTION ... WITH (copy_sequence = true);
> > > >
> > >
> > > How will it interact with the existing copy_data option? So copy_data
> > > will become equivalent to copy_table_data, right?
> >
> > Right.
> >
> > >
> > > > For re-copy (or update) sequences:
> > > >
> > > > ALTER SUBSCRIPTION ... REFRESH PUBLICATION WITH (copy_sequence = true);
> > > >
> > >
> > > Similar to the previous point it can be slightly confusing w.r.t
> > > copy_data. And would copy_sequence here mean that it would copy
> > > sequence values of both pre-existing and newly added sequences, if so,
> > > that would make it behave differently than copy_data?  The other
> > > possibility in this direction would be to introduce an option like
> > > replicate_all_sequences/copy_all_sequences which indicates a copy of
> > > both pre-existing and new sequences, if any.
> >
> > Copying sequence data works differently than replicating table data
> > (initial data copy and logical replication). So I thought the
> > copy_sequence option (or whatever better name) always does both
> > updating pre-existing sequences and adding new sequences. REFRESH
> > PUBLICATION updates the tables to be subscribed, so we also update or
> > add sequences associated to these tables.
> >
>
> Are you imagining the behavior for sequences associated with tables
> differently than the ones defined by the CREATE SEQUENCE .. command? I
> was thinking that users would associate sequences with publications
> similar to what we do for tables for both cases. For example, they
> need to explicitly mention the sequences they want to replicate by
> commands like CREATE PUBLICATION ... FOR SEQUENCE s1, s2, ...; CREATE
> PUBLICATION ... FOR ALL SEQUENCES, or CREATE PUBLICATION ... FOR
> SEQUENCES IN SCHEMA sch1;
>
> In this, variants FOR ALL SEQUENCES and SEQUENCES IN SCHEMA sch1
> should copy both the explicitly defined sequences and sequences
> defined with the tables. Do you think a different variant for just
> copying sequences implicitly associated with tables (say for identity
> columns)?

Oh, I was thinking that your proposal was to copy literally all
sequences by REPLICA/REFRESH SEQUENCE command. But it seems to make
sense to explicitly specify the sequences they want to replicate. It
also means that they can create a publication that has only sequences.
In this case, even if they create a subscription for that publication,
we don't launch any apply workers for that subscription. Right?

Also, given that the main use case (at least as the first step) is
version upgrade, do we really need to support SEQUENCES IN SCHEMA and
even FOR SEQUENCE? The WIP patch Vignesh recently submitted is more
than 6k lines. I think we can cut the scope for the first
implementation so as to make the review easy.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Logical Replication of sequences

От
Amul Sul
Дата:


On Sat, Jun 8, 2024 at 6:43 PM vignesh C <vignesh21@gmail.com> wrote:
On Wed, 5 Jun 2024 at 14:11, Amit Kapila <amit.kapila16@gmail.com> wrote:
[...]
A new catalog table, pg_subscription_seq, has been introduced for
mapping subscriptions to sequences. Additionally, the sequence LSN
(Log Sequence Number) is stored, facilitating determination of
sequence changes occurring before or after the returned sequence
state.
 
Can't it be done using pg_depend? It seems a bit excessive unless I'm missing
something. How do you track sequence mapping with the publication?

Regards,
Amul

Re: Logical Replication of sequences

От
Masahiko Sawada
Дата:
On Mon, Jun 10, 2024 at 3:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Jun 7, 2024 at 7:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Jun 7, 2024 at 7:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Jun 6, 2024 at 6:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Thu, Jun 6, 2024 at 11:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > On Wed, Jun 5, 2024 at 9:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > >
> > > > >
> > > > > > To achieve this, we can allow sequences to be copied during
> > > > > > the initial CREATE SUBSCRIPTION command similar to what we do for
> > > > > > tables. And then later by new/existing command, we re-copy the already
> > > > > > existing sequences on the subscriber.
> > > > > >
> > > > > > The options for the new command could be:
> > > > > > Alter Subscription ... Refresh Sequences
> > > > > > Alter Subscription ... Replicate Sequences
> > > > > >
> > > > > > In the second option, we need to introduce a new keyword Replicate.
> > > > > > Can you think of any better option?
> > > > >
> > > > > Another idea is doing that using options. For example,
> > > > >
> > > > > For initial sequences synchronization:
> > > > >
> > > > > CREATE SUBSCRIPTION ... WITH (copy_sequence = true);
> > > > >
> > > >
> > > > How will it interact with the existing copy_data option? So copy_data
> > > > will become equivalent to copy_table_data, right?
> > >
> > > Right.
> > >
> > > >
> > > > > For re-copy (or update) sequences:
> > > > >
> > > > > ALTER SUBSCRIPTION ... REFRESH PUBLICATION WITH (copy_sequence = true);
> > > > >
> > > >
> > > > Similar to the previous point it can be slightly confusing w.r.t
> > > > copy_data. And would copy_sequence here mean that it would copy
> > > > sequence values of both pre-existing and newly added sequences, if so,
> > > > that would make it behave differently than copy_data?  The other
> > > > possibility in this direction would be to introduce an option like
> > > > replicate_all_sequences/copy_all_sequences which indicates a copy of
> > > > both pre-existing and new sequences, if any.
> > >
> > > Copying sequence data works differently than replicating table data
> > > (initial data copy and logical replication). So I thought the
> > > copy_sequence option (or whatever better name) always does both
> > > updating pre-existing sequences and adding new sequences. REFRESH
> > > PUBLICATION updates the tables to be subscribed, so we also update or
> > > add sequences associated to these tables.
> > >
> >
> > Are you imagining the behavior for sequences associated with tables
> > differently than the ones defined by the CREATE SEQUENCE .. command? I
> > was thinking that users would associate sequences with publications
> > similar to what we do for tables for both cases. For example, they
> > need to explicitly mention the sequences they want to replicate by
> > commands like CREATE PUBLICATION ... FOR SEQUENCE s1, s2, ...; CREATE
> > PUBLICATION ... FOR ALL SEQUENCES, or CREATE PUBLICATION ... FOR
> > SEQUENCES IN SCHEMA sch1;
> >
> > In this, variants FOR ALL SEQUENCES and SEQUENCES IN SCHEMA sch1
> > should copy both the explicitly defined sequences and sequences
> > defined with the tables. Do you think a different variant for just
> > copying sequences implicitly associated with tables (say for identity
> > columns)?
>
> Oh, I was thinking that your proposal was to copy literally all
> sequences by REPLICA/REFRESH SEQUENCE command. But it seems to make
> sense to explicitly specify the sequences they want to replicate. It
> also means that they can create a publication that has only sequences.
> In this case, even if they create a subscription for that publication,
> we don't launch any apply workers for that subscription. Right?
>
> Also, given that the main use case (at least as the first step) is
> version upgrade, do we really need to support SEQUENCES IN SCHEMA and
> even FOR SEQUENCE?

Also, I guess that specifying individual sequences might not be easy
to use for users in some cases. For sequences owned by a column of a
table, users might want to specify them altogether, rather than
separately. For example, CREATE PUBLICATION ... FOR TABLE tab1 WITH
SEQUENCES means to add the table tab1 and its sequences to the
publication. For other sequences (i.e., not owned by any tables),
users might want to specify them individually.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Logical Replication of sequences

От
Amit Kapila
Дата:
On Mon, Jun 10, 2024 at 12:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Jun 10, 2024 at 3:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Jun 7, 2024 at 7:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > Are you imagining the behavior for sequences associated with tables
> > > differently than the ones defined by the CREATE SEQUENCE .. command? I
> > > was thinking that users would associate sequences with publications
> > > similar to what we do for tables for both cases. For example, they
> > > need to explicitly mention the sequences they want to replicate by
> > > commands like CREATE PUBLICATION ... FOR SEQUENCE s1, s2, ...; CREATE
> > > PUBLICATION ... FOR ALL SEQUENCES, or CREATE PUBLICATION ... FOR
> > > SEQUENCES IN SCHEMA sch1;
> > >
> > > In this, variants FOR ALL SEQUENCES and SEQUENCES IN SCHEMA sch1
> > > should copy both the explicitly defined sequences and sequences
> > > defined with the tables. Do you think a different variant for just
> > > copying sequences implicitly associated with tables (say for identity
> > > columns)?
> >
> > Oh, I was thinking that your proposal was to copy literally all
> > sequences by REPLICA/REFRESH SEQUENCE command.
> >

I am trying to keep the behavior as close to tables as possible.

> > But it seems to make
> > sense to explicitly specify the sequences they want to replicate. It
> > also means that they can create a publication that has only sequences.
> > In this case, even if they create a subscription for that publication,
> > we don't launch any apply workers for that subscription. Right?
> >

Right, good point. I had not thought about this.

> > Also, given that the main use case (at least as the first step) is
> > version upgrade, do we really need to support SEQUENCES IN SCHEMA and
> > even FOR SEQUENCE?
>

At the very least, we can split the patch to move these variants to a
separate patch. Once the main patch is finalized, we can try to
evaluate the remaining separately.

> Also, I guess that specifying individual sequences might not be easy
> to use for users in some cases. For sequences owned by a column of a
> table, users might want to specify them altogether, rather than
> separately. For example, CREATE PUBLICATION ... FOR TABLE tab1 WITH
> SEQUENCES means to add the table tab1 and its sequences to the
> publication. For other sequences (i.e., not owned by any tables),
> users might want to specify them individually.
>

Yeah, or we can have a syntax like CREATE PUBLICATION ... FOR TABLE
tab1 INCLUDE SEQUENCES.  Normally, we use the WITH clause for options
(For example, CREATE SUBSCRIPTION ... WITH (streaming=...)).

--
With Regards,
Amit Kapila.



Re: Logical Replication of sequences

От
vignesh C
Дата:
On Mon, 10 Jun 2024 at 12:24, Amul Sul <sulamul@gmail.com> wrote:
>
>
>
> On Sat, Jun 8, 2024 at 6:43 PM vignesh C <vignesh21@gmail.com> wrote:
>>
>> On Wed, 5 Jun 2024 at 14:11, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> [...]
>> A new catalog table, pg_subscription_seq, has been introduced for
>> mapping subscriptions to sequences. Additionally, the sequence LSN
>> (Log Sequence Number) is stored, facilitating determination of
>> sequence changes occurring before or after the returned sequence
>> state.
>
>
> Can't it be done using pg_depend? It seems a bit excessive unless I'm missing
> something.

We'll require the lsn because the sequence LSN informs the user that
it has been synchronized up to the LSN in pg_subscription_seq. Since
we are not supporting incremental sync, the user will be able to
identify if he should run refresh sequences or not by checking the lsn
of the pg_subscription_seq and the lsn of the sequence(using
pg_sequence_state added) in the publisher.  Also, this parallels our
implementation for pg_subscription_seq and will aid in expanding for
a) incremental synchronization and b) utilizing workers for
synchronization using sequence states if necessary.

How do you track sequence mapping with the publication?

In the publisher we use pg_publication_rel and
pg_publication_namespace for mapping the sequences with the
publication.

Regards,
Vignesh
Vignesh



Re: Logical Replication of sequences

От
vignesh C
Дата:
On Mon, 10 Jun 2024 at 14:48, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Jun 10, 2024 at 12:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Jun 10, 2024 at 3:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Fri, Jun 7, 2024 at 7:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > >
> > > > Are you imagining the behavior for sequences associated with tables
> > > > differently than the ones defined by the CREATE SEQUENCE .. command? I
> > > > was thinking that users would associate sequences with publications
> > > > similar to what we do for tables for both cases. For example, they
> > > > need to explicitly mention the sequences they want to replicate by
> > > > commands like CREATE PUBLICATION ... FOR SEQUENCE s1, s2, ...; CREATE
> > > > PUBLICATION ... FOR ALL SEQUENCES, or CREATE PUBLICATION ... FOR
> > > > SEQUENCES IN SCHEMA sch1;
> > > >
> > > > In this, variants FOR ALL SEQUENCES and SEQUENCES IN SCHEMA sch1
> > > > should copy both the explicitly defined sequences and sequences
> > > > defined with the tables. Do you think a different variant for just
> > > > copying sequences implicitly associated with tables (say for identity
> > > > columns)?
> > >
> > > Oh, I was thinking that your proposal was to copy literally all
> > > sequences by REPLICA/REFRESH SEQUENCE command.
> > >
>
> I am trying to keep the behavior as close to tables as possible.
>
> > > But it seems to make
> > > sense to explicitly specify the sequences they want to replicate. It
> > > also means that they can create a publication that has only sequences.
> > > In this case, even if they create a subscription for that publication,
> > > we don't launch any apply workers for that subscription. Right?
> > >
>
> Right, good point. I had not thought about this.
>
> > > Also, given that the main use case (at least as the first step) is
> > > version upgrade, do we really need to support SEQUENCES IN SCHEMA and
> > > even FOR SEQUENCE?
> >
>
> At the very least, we can split the patch to move these variants to a
> separate patch. Once the main patch is finalized, we can try to
> evaluate the remaining separately.

I engaged in an offline discussion with Amit about strategizing the
division of patches to facilitate the review process. We agreed on the
following split: The first patch will encompass the setting and
getting of sequence values (core sequence changes). The second patch
will cover all changes on the publisher side related to "FOR ALL
SEQUENCES." The third patch will address subscriber side changes aimed
at synchronizing "FOR ALL SEQUENCES" publications. The fourth patch
will focus on supporting "FOR SEQUENCE" publication. Lastly, the fifth
patch will introduce support for  "FOR ALL SEQUENCES IN SCHEMA"
publication.

I will work on this and share an updated patch for the same soon.

Regards,
Vignesh



Re: Logical Replication of sequences

От
Amul Sul
Дата:
On Mon, Jun 10, 2024 at 5:00 PM vignesh C <vignesh21@gmail.com> wrote:
On Mon, 10 Jun 2024 at 12:24, Amul Sul <sulamul@gmail.com> wrote:
>
>
>
> On Sat, Jun 8, 2024 at 6:43 PM vignesh C <vignesh21@gmail.com> wrote:
>>
>> On Wed, 5 Jun 2024 at 14:11, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> [...]
>> A new catalog table, pg_subscription_seq, has been introduced for
>> mapping subscriptions to sequences. Additionally, the sequence LSN
>> (Log Sequence Number) is stored, facilitating determination of
>> sequence changes occurring before or after the returned sequence
>> state.
>
>
> Can't it be done using pg_depend? It seems a bit excessive unless I'm missing
> something.

We'll require the lsn because the sequence LSN informs the user that
it has been synchronized up to the LSN in pg_subscription_seq. Since
we are not supporting incremental sync, the user will be able to
identify if he should run refresh sequences or not by checking the lsn
of the pg_subscription_seq and the lsn of the sequence(using
pg_sequence_state added) in the publisher.  Also, this parallels our
implementation for pg_subscription_seq and will aid in expanding for
a) incremental synchronization and b) utilizing workers for
synchronization using sequence states if necessary.

How do you track sequence mapping with the publication?

In the publisher we use pg_publication_rel and
pg_publication_namespace for mapping the sequences with the
publication.

Thanks for the explanation. I'm wondering what the complexity would be, if we
wanted to do something similar on the subscriber side, i.e., tracking via
pg_subscription_rel.

Regards,
Amul

Re: Logical Replication of sequences

От
vignesh C
Дата:
On Tue, 11 Jun 2024 at 09:41, Amul Sul <sulamul@gmail.com> wrote:
>
> On Mon, Jun 10, 2024 at 5:00 PM vignesh C <vignesh21@gmail.com> wrote:
>>
>> On Mon, 10 Jun 2024 at 12:24, Amul Sul <sulamul@gmail.com> wrote:
>> >
>> >
>> >
>> > On Sat, Jun 8, 2024 at 6:43 PM vignesh C <vignesh21@gmail.com> wrote:
>> >>
>> >> On Wed, 5 Jun 2024 at 14:11, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> >> [...]
>> >> A new catalog table, pg_subscription_seq, has been introduced for
>> >> mapping subscriptions to sequences. Additionally, the sequence LSN
>> >> (Log Sequence Number) is stored, facilitating determination of
>> >> sequence changes occurring before or after the returned sequence
>> >> state.
>> >
>> >
>> > Can't it be done using pg_depend? It seems a bit excessive unless I'm missing
>> > something.
>>
>> We'll require the lsn because the sequence LSN informs the user that
>> it has been synchronized up to the LSN in pg_subscription_seq. Since
>> we are not supporting incremental sync, the user will be able to
>> identify if he should run refresh sequences or not by checking the lsn
>> of the pg_subscription_seq and the lsn of the sequence(using
>> pg_sequence_state added) in the publisher.  Also, this parallels our
>> implementation for pg_subscription_seq and will aid in expanding for
>> a) incremental synchronization and b) utilizing workers for
>> synchronization using sequence states if necessary.
>>
>> How do you track sequence mapping with the publication?
>>
>> In the publisher we use pg_publication_rel and
>> pg_publication_namespace for mapping the sequences with the
>> publication.
>
>
> Thanks for the explanation. I'm wondering what the complexity would be, if we
> wanted to do something similar on the subscriber side, i.e., tracking via
> pg_subscription_rel.

Because we won't utilize sync workers to synchronize the sequence, and
the sequence won't necessitate sync states like init, sync,
finishedcopy, syncdone, ready, etc., initially, I considered keeping
the sequences separate. However, I'm ok with using pg_subscription_rel
as it could potentially help in enhancing incremental synchronization
and parallelizing later on.

Regards,
Vignesh



Re: Logical Replication of sequences

От
Masahiko Sawada
Дата:
On Tue, Jun 11, 2024 at 12:25 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Mon, 10 Jun 2024 at 14:48, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Mon, Jun 10, 2024 at 12:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Mon, Jun 10, 2024 at 3:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Fri, Jun 7, 2024 at 7:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > >
> > > > > Are you imagining the behavior for sequences associated with tables
> > > > > differently than the ones defined by the CREATE SEQUENCE .. command? I
> > > > > was thinking that users would associate sequences with publications
> > > > > similar to what we do for tables for both cases. For example, they
> > > > > need to explicitly mention the sequences they want to replicate by
> > > > > commands like CREATE PUBLICATION ... FOR SEQUENCE s1, s2, ...; CREATE
> > > > > PUBLICATION ... FOR ALL SEQUENCES, or CREATE PUBLICATION ... FOR
> > > > > SEQUENCES IN SCHEMA sch1;
> > > > >
> > > > > In this, variants FOR ALL SEQUENCES and SEQUENCES IN SCHEMA sch1
> > > > > should copy both the explicitly defined sequences and sequences
> > > > > defined with the tables. Do you think a different variant for just
> > > > > copying sequences implicitly associated with tables (say for identity
> > > > > columns)?
> > > >
> > > > Oh, I was thinking that your proposal was to copy literally all
> > > > sequences by REPLICA/REFRESH SEQUENCE command.
> > > >
> >
> > I am trying to keep the behavior as close to tables as possible.
> >
> > > > But it seems to make
> > > > sense to explicitly specify the sequences they want to replicate. It
> > > > also means that they can create a publication that has only sequences.
> > > > In this case, even if they create a subscription for that publication,
> > > > we don't launch any apply workers for that subscription. Right?
> > > >
> >
> > Right, good point. I had not thought about this.
> >
> > > > Also, given that the main use case (at least as the first step) is
> > > > version upgrade, do we really need to support SEQUENCES IN SCHEMA and
> > > > even FOR SEQUENCE?
> > >
> >
> > At the very least, we can split the patch to move these variants to a
> > separate patch. Once the main patch is finalized, we can try to
> > evaluate the remaining separately.
>
> I engaged in an offline discussion with Amit about strategizing the
> division of patches to facilitate the review process. We agreed on the
> following split: The first patch will encompass the setting and
> getting of sequence values (core sequence changes). The second patch
> will cover all changes on the publisher side related to "FOR ALL
> SEQUENCES." The third patch will address subscriber side changes aimed
> at synchronizing "FOR ALL SEQUENCES" publications. The fourth patch
> will focus on supporting "FOR SEQUENCE" publication. Lastly, the fifth
> patch will introduce support for  "FOR ALL SEQUENCES IN SCHEMA"
> publication.
>
> I will work on this and share an updated patch for the same soon.

+1. Sounds like a good plan.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Logical Replication of sequences

От
vignesh C
Дата:
On Tue, 11 Jun 2024 at 12:38, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Jun 11, 2024 at 12:25 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Mon, 10 Jun 2024 at 14:48, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, Jun 10, 2024 at 12:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Mon, Jun 10, 2024 at 3:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > On Fri, Jun 7, 2024 at 7:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > >
> > > > > >
> > > > > > Are you imagining the behavior for sequences associated with tables
> > > > > > differently than the ones defined by the CREATE SEQUENCE .. command? I
> > > > > > was thinking that users would associate sequences with publications
> > > > > > similar to what we do for tables for both cases. For example, they
> > > > > > need to explicitly mention the sequences they want to replicate by
> > > > > > commands like CREATE PUBLICATION ... FOR SEQUENCE s1, s2, ...; CREATE
> > > > > > PUBLICATION ... FOR ALL SEQUENCES, or CREATE PUBLICATION ... FOR
> > > > > > SEQUENCES IN SCHEMA sch1;
> > > > > >
> > > > > > In this, variants FOR ALL SEQUENCES and SEQUENCES IN SCHEMA sch1
> > > > > > should copy both the explicitly defined sequences and sequences
> > > > > > defined with the tables. Do you think a different variant for just
> > > > > > copying sequences implicitly associated with tables (say for identity
> > > > > > columns)?
> > > > >
> > > > > Oh, I was thinking that your proposal was to copy literally all
> > > > > sequences by REPLICA/REFRESH SEQUENCE command.
> > > > >
> > >
> > > I am trying to keep the behavior as close to tables as possible.
> > >
> > > > > But it seems to make
> > > > > sense to explicitly specify the sequences they want to replicate. It
> > > > > also means that they can create a publication that has only sequences.
> > > > > In this case, even if they create a subscription for that publication,
> > > > > we don't launch any apply workers for that subscription. Right?
> > > > >
> > >
> > > Right, good point. I had not thought about this.
> > >
> > > > > Also, given that the main use case (at least as the first step) is
> > > > > version upgrade, do we really need to support SEQUENCES IN SCHEMA and
> > > > > even FOR SEQUENCE?
> > > >
> > >
> > > At the very least, we can split the patch to move these variants to a
> > > separate patch. Once the main patch is finalized, we can try to
> > > evaluate the remaining separately.
> >
> > I engaged in an offline discussion with Amit about strategizing the
> > division of patches to facilitate the review process. We agreed on the
> > following split: The first patch will encompass the setting and
> > getting of sequence values (core sequence changes). The second patch
> > will cover all changes on the publisher side related to "FOR ALL
> > SEQUENCES." The third patch will address subscriber side changes aimed
> > at synchronizing "FOR ALL SEQUENCES" publications. The fourth patch
> > will focus on supporting "FOR SEQUENCE" publication. Lastly, the fifth
> > patch will introduce support for  "FOR ALL SEQUENCES IN SCHEMA"
> > publication.
> >
> > I will work on this and share an updated patch for the same soon.
>
> +1. Sounds like a good plan.

Amit and I engaged in an offline discussion regarding the design and
contemplated that it could be like below:
1) CREATE PUBLICATION syntax enhancement:
CREATE PUBLICATION ... FOR ALL SEQUENCES;
The addition of a new column titled "all sequences" in the
pg_publication system table will signify whether the publication is
designated as all sequences publication or not.

2)  CREATE SUBSCRIPTION -- no syntax change.
Upon creation of a subscription, the following additional steps will
be managed by the subscriber:
i) The subscriber will retrieve the list of sequences associated with
the subscription's publications.
ii) For each sequence: a) Retrieve the sequence value from the
publisher by invoking the pg_sequence_state function. b) Set the
sequence with the value obtained from the publisher. iv) Once the
subscription creation is completed, all sequence values will become
visible at the subscriber's end.

An alternative design approach could involve retrieving the sequence
list from the publisher during subscription creation and inserting the
sequences with an "init" state into the pg_subscription_rel system
table. These tasks could be executed by a single sequence sync worker,
which would:
i) Retrieve the list of sequences in the "init" state from the
pg_subscription_rel system table.
ii) Initiate a transaction.
iii) For each sequence: a) Obtain the sequence value from the
publisher by utilizing the pg_sequence_state function. b) Update the
sequence with the value obtained from the publisher.
iv) Commit the transaction.

The benefit with the second approach is that if there are large number
of sequences, the sequence sync can be enhanced to happen in parallel
and also if there are any locks held on the sequences in the
publisher, the sequence worker can wait to acquire the lock instead of
blocking the whole create subscription command which will delay the
initial copy of the tables too.

3) Refreshing the sequence can be achieved through the existing
command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change
here).
The subscriber identifies stale sequences, meaning sequences present
in pg_subscription_rel but absent from the publication, and removes
them from the pg_subscription_rel system table. The subscriber also
checks for newly added sequences in the publisher and synchronizes
their values from the publisher using the steps outlined in the
subscription creation process. It's worth noting that previously
synchronized sequences won't be synchronized again; the sequence sync
will occur solely for the newly added sequences.

4) Introducing a new command for refreshing all sequences: ALTER
SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES.
The subscriber will remove stale sequences and add newly added
sequences from the publisher. Following this, it will re-synchronize
the sequence values for all sequences in the updated list from the
publisher, following the steps outlined in the subscription creation
process.

5) Incorporate the pg_sequence_state function to fetch the sequence
value from the publisher, along with the page LSN. Incorporate
SetSequence function, which will procure a new relfilenode for the
sequence and set the new relfilenode with the specified value. This
will facilitate rollback in case of any failures.

Thoughts?

Regards,
Vignesh



Re: Logical Replication of sequences

От
Masahiko Sawada
Дата:
On Tue, Jun 11, 2024 at 7:36 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, 11 Jun 2024 at 12:38, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Jun 11, 2024 at 12:25 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Mon, 10 Jun 2024 at 14:48, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Mon, Jun 10, 2024 at 12:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > On Mon, Jun 10, 2024 at 3:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > > On Fri, Jun 7, 2024 at 7:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > >
> > > > > > >
> > > > > > > Are you imagining the behavior for sequences associated with tables
> > > > > > > differently than the ones defined by the CREATE SEQUENCE .. command? I
> > > > > > > was thinking that users would associate sequences with publications
> > > > > > > similar to what we do for tables for both cases. For example, they
> > > > > > > need to explicitly mention the sequences they want to replicate by
> > > > > > > commands like CREATE PUBLICATION ... FOR SEQUENCE s1, s2, ...; CREATE
> > > > > > > PUBLICATION ... FOR ALL SEQUENCES, or CREATE PUBLICATION ... FOR
> > > > > > > SEQUENCES IN SCHEMA sch1;
> > > > > > >
> > > > > > > In this, variants FOR ALL SEQUENCES and SEQUENCES IN SCHEMA sch1
> > > > > > > should copy both the explicitly defined sequences and sequences
> > > > > > > defined with the tables. Do you think a different variant for just
> > > > > > > copying sequences implicitly associated with tables (say for identity
> > > > > > > columns)?
> > > > > >
> > > > > > Oh, I was thinking that your proposal was to copy literally all
> > > > > > sequences by REPLICA/REFRESH SEQUENCE command.
> > > > > >
> > > >
> > > > I am trying to keep the behavior as close to tables as possible.
> > > >
> > > > > > But it seems to make
> > > > > > sense to explicitly specify the sequences they want to replicate. It
> > > > > > also means that they can create a publication that has only sequences.
> > > > > > In this case, even if they create a subscription for that publication,
> > > > > > we don't launch any apply workers for that subscription. Right?
> > > > > >
> > > >
> > > > Right, good point. I had not thought about this.
> > > >
> > > > > > Also, given that the main use case (at least as the first step) is
> > > > > > version upgrade, do we really need to support SEQUENCES IN SCHEMA and
> > > > > > even FOR SEQUENCE?
> > > > >
> > > >
> > > > At the very least, we can split the patch to move these variants to a
> > > > separate patch. Once the main patch is finalized, we can try to
> > > > evaluate the remaining separately.
> > >
> > > I engaged in an offline discussion with Amit about strategizing the
> > > division of patches to facilitate the review process. We agreed on the
> > > following split: The first patch will encompass the setting and
> > > getting of sequence values (core sequence changes). The second patch
> > > will cover all changes on the publisher side related to "FOR ALL
> > > SEQUENCES." The third patch will address subscriber side changes aimed
> > > at synchronizing "FOR ALL SEQUENCES" publications. The fourth patch
> > > will focus on supporting "FOR SEQUENCE" publication. Lastly, the fifth
> > > patch will introduce support for  "FOR ALL SEQUENCES IN SCHEMA"
> > > publication.
> > >
> > > I will work on this and share an updated patch for the same soon.
> >
> > +1. Sounds like a good plan.
>
> Amit and I engaged in an offline discussion regarding the design and
> contemplated that it could be like below:
> 1) CREATE PUBLICATION syntax enhancement:
> CREATE PUBLICATION ... FOR ALL SEQUENCES;
> The addition of a new column titled "all sequences" in the
> pg_publication system table will signify whether the publication is
> designated as all sequences publication or not.
>

The first approach sounds like we don't create entries for sequences
in pg_subscription_rel. In this case, how do we know all sequences
that we need to refresh when executing the REFRESH PUBLICATION
SEQUENCES command you mentioned below?

> 2)  CREATE SUBSCRIPTION -- no syntax change.
> Upon creation of a subscription, the following additional steps will
> be managed by the subscriber:
> i) The subscriber will retrieve the list of sequences associated with
> the subscription's publications.
> ii) For each sequence: a) Retrieve the sequence value from the
> publisher by invoking the pg_sequence_state function. b) Set the
> sequence with the value obtained from the publisher. iv) Once the
> subscription creation is completed, all sequence values will become
> visible at the subscriber's end.

Sequence values are always copied from the publisher? or does it
happen only when copy_data = true?

>
> An alternative design approach could involve retrieving the sequence
> list from the publisher during subscription creation and inserting the
> sequences with an "init" state into the pg_subscription_rel system
> table. These tasks could be executed by a single sequence sync worker,
> which would:
> i) Retrieve the list of sequences in the "init" state from the
> pg_subscription_rel system table.
> ii) Initiate a transaction.
> iii) For each sequence: a) Obtain the sequence value from the
> publisher by utilizing the pg_sequence_state function. b) Update the
> sequence with the value obtained from the publisher.
> iv) Commit the transaction.
>
> The benefit with the second approach is that if there are large number
> of sequences, the sequence sync can be enhanced to happen in parallel
> and also if there are any locks held on the sequences in the
> publisher, the sequence worker can wait to acquire the lock instead of
> blocking the whole create subscription command which will delay the
> initial copy of the tables too.

I prefer to have separate workers to sync sequences. Probably we can
start with a single worker and extend it to have multiple workers. BTW
the sequence-sync worker will be taken from
max_sync_workers_per_subscription pool?

Or yet another idea I came up with is that a tablesync worker will
synchronize both the table and sequences owned by the table. That is,
after the tablesync worker caught up with the apply worker, the
tablesync worker synchronizes sequences associated with the target
table as well. One benefit would be that at the time of initial table
sync being completed, the table and its sequence data are consistent.
As soon as new changes come to the table, it would become inconsistent
so it might not be helpful much, though. Also, sequences that are not
owned by any table will still need to be synchronized by someone.

>
> 3) Refreshing the sequence can be achieved through the existing
> command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change
> here).
> The subscriber identifies stale sequences, meaning sequences present
> in pg_subscription_rel but absent from the publication, and removes
> them from the pg_subscription_rel system table. The subscriber also
> checks for newly added sequences in the publisher and synchronizes
> their values from the publisher using the steps outlined in the
> subscription creation process. It's worth noting that previously
> synchronized sequences won't be synchronized again; the sequence sync
> will occur solely for the newly added sequences.
>
> 4) Introducing a new command for refreshing all sequences: ALTER
> SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES.
> The subscriber will remove stale sequences and add newly added
> sequences from the publisher. Following this, it will re-synchronize
> the sequence values for all sequences in the updated list from the
> publisher, following the steps outlined in the subscription creation
> process.

The difference between 3) and 4) is whether or not to re-synchronize
the previously synchronized sequences. Do we really want to introduce
a new command for 4)? I felt that we can invent an option say
copy_all_sequence for the REFRESH PUBLICATION command to cover the 4)
case.

>
> 5) Incorporate the pg_sequence_state function to fetch the sequence
> value from the publisher, along with the page LSN. Incorporate
> SetSequence function, which will procure a new relfilenode for the
> sequence and set the new relfilenode with the specified value. This
> will facilitate rollback in case of any failures.

Does it mean that we create a new relfilenode for every update of the value?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Logical Replication of sequences

От
Dilip Kumar
Дата:
On Tue, Jun 11, 2024 at 4:06 PM vignesh C <vignesh21@gmail.com> wrote:
>
> Amit and I engaged in an offline discussion regarding the design and
> contemplated that it could be like below:

If I understand correctly, does this require the sequences to already
exist on the subscribing node before creating the subscription, or
will it also copy any non-existing sequences?

> 1) CREATE PUBLICATION syntax enhancement:
> CREATE PUBLICATION ... FOR ALL SEQUENCES;
> The addition of a new column titled "all sequences" in the
> pg_publication system table will signify whether the publication is
> designated as all sequences publication or not.
>
> 2)  CREATE SUBSCRIPTION -- no syntax change.
> Upon creation of a subscription, the following additional steps will
> be managed by the subscriber:
> i) The subscriber will retrieve the list of sequences associated with
> the subscription's publications.
> ii) For each sequence: a) Retrieve the sequence value from the
> publisher by invoking the pg_sequence_state function. b) Set the
> sequence with the value obtained from the publisher. iv) Once the
> subscription creation is completed, all sequence values will become
> visible at the subscriber's end.
>
> An alternative design approach could involve retrieving the sequence
> list from the publisher during subscription creation and inserting the
> sequences with an "init" state into the pg_subscription_rel system
> table. These tasks could be executed by a single sequence sync worker,
> which would:
> i) Retrieve the list of sequences in the "init" state from the
> pg_subscription_rel system table.
> ii) Initiate a transaction.
> iii) For each sequence: a) Obtain the sequence value from the
> publisher by utilizing the pg_sequence_state function. b) Update the
> sequence with the value obtained from the publisher.
> iv) Commit the transaction.
>
> The benefit with the second approach is that if there are large number
> of sequences, the sequence sync can be enhanced to happen in parallel
> and also if there are any locks held on the sequences in the
> publisher, the sequence worker can wait to acquire the lock instead of
> blocking the whole create subscription command which will delay the
> initial copy of the tables too.

Yeah w.r.t. this point second approach seems better.

> 3) Refreshing the sequence can be achieved through the existing
> command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change
> here).
> The subscriber identifies stale sequences, meaning sequences present
> in pg_subscription_rel but absent from the publication, and removes
> them from the pg_subscription_rel system table. The subscriber also
> checks for newly added sequences in the publisher and synchronizes
> their values from the publisher using the steps outlined in the
> subscription creation process. It's worth noting that previously
> synchronized sequences won't be synchronized again; the sequence sync
> will occur solely for the newly added sequences.

> 4) Introducing a new command for refreshing all sequences: ALTER
> SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES.
> The subscriber will remove stale sequences and add newly added
> sequences from the publisher. Following this, it will re-synchronize
> the sequence values for all sequences in the updated list from the
> publisher, following the steps outlined in the subscription creation
> process.

Okay, this answers my first question: we will remove the sequences
that are removed from the publisher and add the new sequences. I don't
see any problem with this, but doesn't it seem like we are effectively
doing DDL replication only for sequences without having a
comprehensive plan for overall DDL replication?

> 5) Incorporate the pg_sequence_state function to fetch the sequence
> value from the publisher, along with the page LSN. Incorporate
> SetSequence function, which will procure a new relfilenode for the
> sequence and set the new relfilenode with the specified value. This
> will facilitate rollback in case of any failures.

I do not understand this point, you mean whenever we are fetching the
sequence value from the publisher we need to create a new relfilenode
on the subscriber?  Why not just update the catalog tuple is
sufficient?  Or this is for handling the ALTER SEQUENCE case?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Logical Replication of sequences

От
Amit Kapila
Дата:
On Wed, Jun 12, 2024 at 10:44 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Jun 11, 2024 at 7:36 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > 1) CREATE PUBLICATION syntax enhancement:
> > CREATE PUBLICATION ... FOR ALL SEQUENCES;
> > The addition of a new column titled "all sequences" in the
> > pg_publication system table will signify whether the publication is
> > designated as all sequences publication or not.
> >
>
> The first approach sounds like we don't create entries for sequences
> in pg_subscription_rel. In this case, how do we know all sequences
> that we need to refresh when executing the REFRESH PUBLICATION
> SEQUENCES command you mentioned below?
>

As per my understanding, we should be creating entries for sequences
in pg_subscription_rel similar to tables. The difference would be that
we won't need all the sync_states (i = initialize, d = data is being
copied, f = finished table copy, s = synchronized, r = ready) as we
don't need any synchronization with apply workers.

> > 2)  CREATE SUBSCRIPTION -- no syntax change.
> > Upon creation of a subscription, the following additional steps will
> > be managed by the subscriber:
> > i) The subscriber will retrieve the list of sequences associated with
> > the subscription's publications.
> > ii) For each sequence: a) Retrieve the sequence value from the
> > publisher by invoking the pg_sequence_state function. b) Set the
> > sequence with the value obtained from the publisher. iv) Once the
> > subscription creation is completed, all sequence values will become
> > visible at the subscriber's end.
>
> Sequence values are always copied from the publisher? or does it
> happen only when copy_data = true?
>

It is better to do it when "copy_data = true" to keep it compatible
with the table's behavior.

> >
> > An alternative design approach could involve retrieving the sequence
> > list from the publisher during subscription creation and inserting the
> > sequences with an "init" state into the pg_subscription_rel system
> > table. These tasks could be executed by a single sequence sync worker,
> > which would:
> > i) Retrieve the list of sequences in the "init" state from the
> > pg_subscription_rel system table.
> > ii) Initiate a transaction.
> > iii) For each sequence: a) Obtain the sequence value from the
> > publisher by utilizing the pg_sequence_state function. b) Update the
> > sequence with the value obtained from the publisher.
> > iv) Commit the transaction.
> >
> > The benefit with the second approach is that if there are large number
> > of sequences, the sequence sync can be enhanced to happen in parallel
> > and also if there are any locks held on the sequences in the
> > publisher, the sequence worker can wait to acquire the lock instead of
> > blocking the whole create subscription command which will delay the
> > initial copy of the tables too.
>
> I prefer to have separate workers to sync sequences.
>

+1.

> Probably we can
> start with a single worker and extend it to have multiple workers.

Yeah, starting with a single worker sounds good for now. Do you think
we should sync all the sequences in a single transaction or have some
threshold value above which a different transaction would be required
or maybe a different sequence sync worker altogether? Now, having
multiple sequence-sync workers requires some synchronization so that
only a single worker is allocated for one sequence.

The simplest thing is to use a single sequence sync worker that syncs
all sequences in one transaction but with a large number of sequences,
it could be inefficient. OTOH, I am not sure if it would be a problem
in reality.

>
> BTW
> the sequence-sync worker will be taken from
> max_sync_workers_per_subscription pool?
>

I think so.

> Or yet another idea I came up with is that a tablesync worker will
> synchronize both the table and sequences owned by the table. That is,
> after the tablesync worker caught up with the apply worker, the
> tablesync worker synchronizes sequences associated with the target
> table as well. One benefit would be that at the time of initial table
> sync being completed, the table and its sequence data are consistent.
> As soon as new changes come to the table, it would become inconsistent
> so it might not be helpful much, though. Also, sequences that are not
> owned by any table will still need to be synchronized by someone.
>

The other thing to consider in this idea is that we somehow need to
distinguish the sequences owned by the table.

> >
> > 3) Refreshing the sequence can be achieved through the existing
> > command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change
> > here).
> > The subscriber identifies stale sequences, meaning sequences present
> > in pg_subscription_rel but absent from the publication, and removes
> > them from the pg_subscription_rel system table. The subscriber also
> > checks for newly added sequences in the publisher and synchronizes
> > their values from the publisher using the steps outlined in the
> > subscription creation process. It's worth noting that previously
> > synchronized sequences won't be synchronized again; the sequence sync
> > will occur solely for the newly added sequences.
> >
> > 4) Introducing a new command for refreshing all sequences: ALTER
> > SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES.
> > The subscriber will remove stale sequences and add newly added
> > sequences from the publisher. Following this, it will re-synchronize
> > the sequence values for all sequences in the updated list from the
> > publisher, following the steps outlined in the subscription creation
> > process.
>
> The difference between 3) and 4) is whether or not to re-synchronize
> the previously synchronized sequences. Do we really want to introduce
> a new command for 4)? I felt that we can invent an option say
> copy_all_sequence for the REFRESH PUBLICATION command to cover the 4)
> case.
>

Yeah, that is also an option but it could confuse along with copy_data
option. Say the user has selected copy_data = false but
copy_all_sequences = true then the first option indicates to *not*
copy the data of table and sequences and the second option indicates
to copy the sequences data which sounds contradictory. The other idea
is to have an option copy_existing_sequences (which indicates to copy
existing sequence values) but that also has somewhat the same drawback
as copy_all_sequences but to a lesser degree.

> >
> > 5) Incorporate the pg_sequence_state function to fetch the sequence
> > value from the publisher, along with the page LSN. Incorporate
> > SetSequence function, which will procure a new relfilenode for the
> > sequence and set the new relfilenode with the specified value. This
> > will facilitate rollback in case of any failures.
>
> Does it mean that we create a new relfilenode for every update of the value?
>

We need it for initial sync so that if there is an error both the
sequence state in pg_subscription_rel and sequence values can be
rolled back together. However, it is unclear whether we need to create
a new relfilenode while copying existing sequences (say during ALTER
SUBSCRIPTION .. REFRESH PUBLICATION SEQUENCES, or whatever command we
decide)? Probably the answer lies in how we want to implement this
command. If we want to copy all sequence values during the command
itself then it is probably okay but if we want to handover this task
to the sequence-sync worker then we need some state management and a
new relfilenode so that on error both state and sequence values are
rolled back.

--
With Regards,
Amit Kapila.



Re: Logical Replication of sequences

От
vignesh C
Дата:
On Wed, 12 Jun 2024 at 10:51, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Tue, Jun 11, 2024 at 4:06 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > Amit and I engaged in an offline discussion regarding the design and
> > contemplated that it could be like below:
>
> If I understand correctly, does this require the sequences to already
> exist on the subscribing node before creating the subscription, or
> will it also copy any non-existing sequences?

Sequences must exist in the subscriber; we'll synchronize only their
values. Any sequences that are not present in the subscriber will
trigger an error.

> > 1) CREATE PUBLICATION syntax enhancement:
> > CREATE PUBLICATION ... FOR ALL SEQUENCES;
> > The addition of a new column titled "all sequences" in the
> > pg_publication system table will signify whether the publication is
> > designated as all sequences publication or not.
> >
> > 2)  CREATE SUBSCRIPTION -- no syntax change.
> > Upon creation of a subscription, the following additional steps will
> > be managed by the subscriber:
> > i) The subscriber will retrieve the list of sequences associated with
> > the subscription's publications.
> > ii) For each sequence: a) Retrieve the sequence value from the
> > publisher by invoking the pg_sequence_state function. b) Set the
> > sequence with the value obtained from the publisher. iv) Once the
> > subscription creation is completed, all sequence values will become
> > visible at the subscriber's end.
> >
> > An alternative design approach could involve retrieving the sequence
> > list from the publisher during subscription creation and inserting the
> > sequences with an "init" state into the pg_subscription_rel system
> > table. These tasks could be executed by a single sequence sync worker,
> > which would:
> > i) Retrieve the list of sequences in the "init" state from the
> > pg_subscription_rel system table.
> > ii) Initiate a transaction.
> > iii) For each sequence: a) Obtain the sequence value from the
> > publisher by utilizing the pg_sequence_state function. b) Update the
> > sequence with the value obtained from the publisher.
> > iv) Commit the transaction.
> >
> > The benefit with the second approach is that if there are large number
> > of sequences, the sequence sync can be enhanced to happen in parallel
> > and also if there are any locks held on the sequences in the
> > publisher, the sequence worker can wait to acquire the lock instead of
> > blocking the whole create subscription command which will delay the
> > initial copy of the tables too.
>
> Yeah w.r.t. this point second approach seems better.

ok

> > 3) Refreshing the sequence can be achieved through the existing
> > command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change
> > here).
> > The subscriber identifies stale sequences, meaning sequences present
> > in pg_subscription_rel but absent from the publication, and removes
> > them from the pg_subscription_rel system table. The subscriber also
> > checks for newly added sequences in the publisher and synchronizes
> > their values from the publisher using the steps outlined in the
> > subscription creation process. It's worth noting that previously
> > synchronized sequences won't be synchronized again; the sequence sync
> > will occur solely for the newly added sequences.
>
> > 4) Introducing a new command for refreshing all sequences: ALTER
> > SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES.
> > The subscriber will remove stale sequences and add newly added
> > sequences from the publisher. Following this, it will re-synchronize
> > the sequence values for all sequences in the updated list from the
> > publisher, following the steps outlined in the subscription creation
> > process.
>
> Okay, this answers my first question: we will remove the sequences
> that are removed from the publisher and add the new sequences. I don't
> see any problem with this, but doesn't it seem like we are effectively
> doing DDL replication only for sequences without having a
> comprehensive plan for overall DDL replication?

What I intended to convey is that we'll eliminate the sequences from
pg_subscription_rel. We won't facilitate the DDL replication of
sequences; instead, we anticipate users to create the sequences
themselves.

> > 5) Incorporate the pg_sequence_state function to fetch the sequence
> > value from the publisher, along with the page LSN. Incorporate
> > SetSequence function, which will procure a new relfilenode for the
> > sequence and set the new relfilenode with the specified value. This
> > will facilitate rollback in case of any failures.
>
> I do not understand this point, you mean whenever we are fetching the
> sequence value from the publisher we need to create a new relfilenode
> on the subscriber?  Why not just update the catalog tuple is
> sufficient?  Or this is for handling the ALTER SEQUENCE case?

Sequences operate distinctively from tables. Alterations to sequences
reflect instantly in another session, even before committing the
transaction. To ensure the synchronization of sequence value and state
updates in pg_subscription_rel, we assign it a new relfilenode. This
strategy ensures that any potential errors allow for the rollback of
both the sequence state in pg_subscription_rel and the sequence values
simultaneously.

Regards,
Vignesh



Re: Logical Replication of sequences

От
Dilip Kumar
Дата:
On Wed, Jun 12, 2024 at 4:08 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, 12 Jun 2024 at 10:51, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Tue, Jun 11, 2024 at 4:06 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > Amit and I engaged in an offline discussion regarding the design and
> > > contemplated that it could be like below:
> >
> > If I understand correctly, does this require the sequences to already
> > exist on the subscribing node before creating the subscription, or
> > will it also copy any non-existing sequences?
>
> Sequences must exist in the subscriber; we'll synchronize only their
> values. Any sequences that are not present in the subscriber will
> trigger an error.

Okay, that makes sense.

>
> > > 3) Refreshing the sequence can be achieved through the existing
> > > command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change
> > > here).
> > > The subscriber identifies stale sequences, meaning sequences present
> > > in pg_subscription_rel but absent from the publication, and removes
> > > them from the pg_subscription_rel system table. The subscriber also
> > > checks for newly added sequences in the publisher and synchronizes
> > > their values from the publisher using the steps outlined in the
> > > subscription creation process. It's worth noting that previously
> > > synchronized sequences won't be synchronized again; the sequence sync
> > > will occur solely for the newly added sequences.
> >
> > > 4) Introducing a new command for refreshing all sequences: ALTER
> > > SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES.
> > > The subscriber will remove stale sequences and add newly added
> > > sequences from the publisher. Following this, it will re-synchronize
> > > the sequence values for all sequences in the updated list from the
> > > publisher, following the steps outlined in the subscription creation
> > > process.
> >
> > Okay, this answers my first question: we will remove the sequences
> > that are removed from the publisher and add the new sequences. I don't
> > see any problem with this, but doesn't it seem like we are effectively
> > doing DDL replication only for sequences without having a
> > comprehensive plan for overall DDL replication?
>
> What I intended to convey is that we'll eliminate the sequences from
> pg_subscription_rel. We won't facilitate the DDL replication of
> sequences; instead, we anticipate users to create the sequences
> themselves.

hmm okay.

> > > 5) Incorporate the pg_sequence_state function to fetch the sequence
> > > value from the publisher, along with the page LSN. Incorporate
> > > SetSequence function, which will procure a new relfilenode for the
> > > sequence and set the new relfilenode with the specified value. This
> > > will facilitate rollback in case of any failures.
> >
> > I do not understand this point, you mean whenever we are fetching the
> > sequence value from the publisher we need to create a new relfilenode
> > on the subscriber?  Why not just update the catalog tuple is
> > sufficient?  Or this is for handling the ALTER SEQUENCE case?
>
> Sequences operate distinctively from tables. Alterations to sequences
> reflect instantly in another session, even before committing the
> transaction. To ensure the synchronization of sequence value and state
> updates in pg_subscription_rel, we assign it a new relfilenode. This
> strategy ensures that any potential errors allow for the rollback of
> both the sequence state in pg_subscription_rel and the sequence values
> simultaneously.

So, you're saying that when we synchronize the sequence values on the
subscriber side, we will create a new relfilenode to allow reverting
to the old state of the sequence in case of an error or transaction
rollback? But why would we want to do that? Generally, even if you
call nextval() on a sequence and then roll back the transaction, the
sequence value doesn't revert to the old value. So, what specific
problem on the subscriber side are we trying to avoid by operating on
a new relfilenode?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Logical Replication of sequences

От
vignesh C
Дата:
On Wed, 12 Jun 2024 at 17:09, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Wed, Jun 12, 2024 at 4:08 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Wed, 12 Jun 2024 at 10:51, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Tue, Jun 11, 2024 at 4:06 PM vignesh C <vignesh21@gmail.com> wrote:
> > > >
> > > > Amit and I engaged in an offline discussion regarding the design and
> > > > contemplated that it could be like below:
> > >
> > > If I understand correctly, does this require the sequences to already
> > > exist on the subscribing node before creating the subscription, or
> > > will it also copy any non-existing sequences?
> >
> > Sequences must exist in the subscriber; we'll synchronize only their
> > values. Any sequences that are not present in the subscriber will
> > trigger an error.
>
> Okay, that makes sense.
>
> >
> > > > 3) Refreshing the sequence can be achieved through the existing
> > > > command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change
> > > > here).
> > > > The subscriber identifies stale sequences, meaning sequences present
> > > > in pg_subscription_rel but absent from the publication, and removes
> > > > them from the pg_subscription_rel system table. The subscriber also
> > > > checks for newly added sequences in the publisher and synchronizes
> > > > their values from the publisher using the steps outlined in the
> > > > subscription creation process. It's worth noting that previously
> > > > synchronized sequences won't be synchronized again; the sequence sync
> > > > will occur solely for the newly added sequences.
> > >
> > > > 4) Introducing a new command for refreshing all sequences: ALTER
> > > > SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES.
> > > > The subscriber will remove stale sequences and add newly added
> > > > sequences from the publisher. Following this, it will re-synchronize
> > > > the sequence values for all sequences in the updated list from the
> > > > publisher, following the steps outlined in the subscription creation
> > > > process.
> > >
> > > Okay, this answers my first question: we will remove the sequences
> > > that are removed from the publisher and add the new sequences. I don't
> > > see any problem with this, but doesn't it seem like we are effectively
> > > doing DDL replication only for sequences without having a
> > > comprehensive plan for overall DDL replication?
> >
> > What I intended to convey is that we'll eliminate the sequences from
> > pg_subscription_rel. We won't facilitate the DDL replication of
> > sequences; instead, we anticipate users to create the sequences
> > themselves.
>
> hmm okay.
>
> > > > 5) Incorporate the pg_sequence_state function to fetch the sequence
> > > > value from the publisher, along with the page LSN. Incorporate
> > > > SetSequence function, which will procure a new relfilenode for the
> > > > sequence and set the new relfilenode with the specified value. This
> > > > will facilitate rollback in case of any failures.
> > >
> > > I do not understand this point, you mean whenever we are fetching the
> > > sequence value from the publisher we need to create a new relfilenode
> > > on the subscriber?  Why not just update the catalog tuple is
> > > sufficient?  Or this is for handling the ALTER SEQUENCE case?
> >
> > Sequences operate distinctively from tables. Alterations to sequences
> > reflect instantly in another session, even before committing the
> > transaction. To ensure the synchronization of sequence value and state
> > updates in pg_subscription_rel, we assign it a new relfilenode. This
> > strategy ensures that any potential errors allow for the rollback of
> > both the sequence state in pg_subscription_rel and the sequence values
> > simultaneously.
>
> So, you're saying that when we synchronize the sequence values on the
> subscriber side, we will create a new relfilenode to allow reverting
> to the old state of the sequence in case of an error or transaction
> rollback? But why would we want to do that? Generally, even if you
> call nextval() on a sequence and then roll back the transaction, the
> sequence value doesn't revert to the old value. So, what specific
> problem on the subscriber side are we trying to avoid by operating on
> a new relfilenode?

Let's consider a situation where we have two sequences: seq1 with a
value of 100 and seq2 with a value of 200. Now, let's say seq1 is
synced and updated to 100, then we attempt to synchronize seq2,
there's a failure due to the sequence not existing or encountering
some other issue. In this scenario, we don't want to halt operations
where seq1 is synchronized, but the sequence state for sequence isn't
changed to "ready" in pg_subscription_rel.
Updating the sequence data directly reflects the sequence change
immediately. However, if we assign a new relfile node for the sequence
and update the sequence value for the new relfile node, until the
transaction is committed, other concurrent users will still be
utilizing the old relfile node for the sequence, and only the old data
will be visible. Once all sequences are synchronized, and the sequence
state is updated in pg_subscription_rel, the transaction will either
be committed or aborted. If committed, users will be able to observe
the new sequence values because the sequences will be updated with the
new relfile node containing the updated sequence value.

Regards,
Vignesh



Re: Logical Replication of sequences

От
Dilip Kumar
Дата:
On Thu, Jun 13, 2024 at 10:10 AM vignesh C <vignesh21@gmail.com> wrote:
>
> > So, you're saying that when we synchronize the sequence values on the
> > subscriber side, we will create a new relfilenode to allow reverting
> > to the old state of the sequence in case of an error or transaction
> > rollback? But why would we want to do that? Generally, even if you
> > call nextval() on a sequence and then roll back the transaction, the
> > sequence value doesn't revert to the old value. So, what specific
> > problem on the subscriber side are we trying to avoid by operating on
> > a new relfilenode?
>
> Let's consider a situation where we have two sequences: seq1 with a
> value of 100 and seq2 with a value of 200. Now, let's say seq1 is
> synced and updated to 100, then we attempt to synchronize seq2,
> there's a failure due to the sequence not existing or encountering
> some other issue. In this scenario, we don't want to halt operations
> where seq1 is synchronized, but the sequence state for sequence isn't
> changed to "ready" in pg_subscription_rel.

Thanks for the explanation, but I am still not getting it completely,
do you mean to say unless all the sequences are not synced any of the
sequences would not be marked "ready" in pg_subscription_rel? Is that
necessary? I mean why we can not sync the sequences one by one and
mark them ready?  Why it is necessary to either have all the sequences
synced or none of them?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Logical Replication of sequences

От
vignesh C
Дата:
On Thu, 13 Jun 2024 at 10:27, Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Thu, Jun 13, 2024 at 10:10 AM vignesh C <vignesh21@gmail.com> wrote:
> >
> > > So, you're saying that when we synchronize the sequence values on the
> > > subscriber side, we will create a new relfilenode to allow reverting
> > > to the old state of the sequence in case of an error or transaction
> > > rollback? But why would we want to do that? Generally, even if you
> > > call nextval() on a sequence and then roll back the transaction, the
> > > sequence value doesn't revert to the old value. So, what specific
> > > problem on the subscriber side are we trying to avoid by operating on
> > > a new relfilenode?
> >
> > Let's consider a situation where we have two sequences: seq1 with a
> > value of 100 and seq2 with a value of 200. Now, let's say seq1 is
> > synced and updated to 100, then we attempt to synchronize seq2,
> > there's a failure due to the sequence not existing or encountering
> > some other issue. In this scenario, we don't want to halt operations
> > where seq1 is synchronized, but the sequence state for sequence isn't
> > changed to "ready" in pg_subscription_rel.
>
> Thanks for the explanation, but I am still not getting it completely,
> do you mean to say unless all the sequences are not synced any of the
> sequences would not be marked "ready" in pg_subscription_rel? Is that
> necessary? I mean why we can not sync the sequences one by one and
> mark them ready?  Why it is necessary to either have all the sequences
> synced or none of them?

Since updating the sequence is one operation and setting
pg_subscription_rel is another, I was trying to avoid a situation
where the sequence is updated but its state is not reflected in
pg_subscription_rel. It seems you are suggesting that it's acceptable
for the sequence to be updated even if its state isn't updated in
pg_subscription_rel, and in such cases, the sequence value does not
need to be reverted.

Regards,
Vignesh



Re: Logical Replication of sequences

От
Dilip Kumar
Дата:
On Thu, Jun 13, 2024 at 11:53 AM vignesh C <vignesh21@gmail.com> wrote:
>
> On Thu, 13 Jun 2024 at 10:27, Dilip Kumar <dilipbalaut@gmail.com> wrote:

> > Thanks for the explanation, but I am still not getting it completely,
> > do you mean to say unless all the sequences are not synced any of the
> > sequences would not be marked "ready" in pg_subscription_rel? Is that
> > necessary? I mean why we can not sync the sequences one by one and
> > mark them ready?  Why it is necessary to either have all the sequences
> > synced or none of them?
>
> Since updating the sequence is one operation and setting
> pg_subscription_rel is another, I was trying to avoid a situation
> where the sequence is updated but its state is not reflected in
> pg_subscription_rel. It seems you are suggesting that it's acceptable
> for the sequence to be updated even if its state isn't updated in
> pg_subscription_rel, and in such cases, the sequence value does not
> need to be reverted.

Right, the complexity we're adding to achieve a behavior that may not
be truly desirable is a concern. For instance, if we mark the status
as ready but do not sync the sequences, it could lead to issues.
However, if we have synced some sequences but encounter a failure
without marking the status as ready, I don't consider it inconsistent
in any way.  But anyway, now I understand your thinking behind that so
it's a good idea to leave this design behavior for a later decision.
Gathering more opinions and insights during later stages will provide
a clearer perspective on how to proceed with this aspect.  Thanks.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



Re: Logical Replication of sequences

От
Masahiko Sawada
Дата:
On Wed, Jun 12, 2024 at 6:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jun 12, 2024 at 10:44 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Jun 11, 2024 at 7:36 PM vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > 1) CREATE PUBLICATION syntax enhancement:
> > > CREATE PUBLICATION ... FOR ALL SEQUENCES;
> > > The addition of a new column titled "all sequences" in the
> > > pg_publication system table will signify whether the publication is
> > > designated as all sequences publication or not.
> > >
> >
> > The first approach sounds like we don't create entries for sequences
> > in pg_subscription_rel. In this case, how do we know all sequences
> > that we need to refresh when executing the REFRESH PUBLICATION
> > SEQUENCES command you mentioned below?
> >
>
> As per my understanding, we should be creating entries for sequences
> in pg_subscription_rel similar to tables. The difference would be that
> we won't need all the sync_states (i = initialize, d = data is being
> copied, f = finished table copy, s = synchronized, r = ready) as we
> don't need any synchronization with apply workers.

Agreed.

>
> > > 2)  CREATE SUBSCRIPTION -- no syntax change.
> > > Upon creation of a subscription, the following additional steps will
> > > be managed by the subscriber:
> > > i) The subscriber will retrieve the list of sequences associated with
> > > the subscription's publications.
> > > ii) For each sequence: a) Retrieve the sequence value from the
> > > publisher by invoking the pg_sequence_state function. b) Set the
> > > sequence with the value obtained from the publisher. iv) Once the
> > > subscription creation is completed, all sequence values will become
> > > visible at the subscriber's end.
> >
> > Sequence values are always copied from the publisher? or does it
> > happen only when copy_data = true?
> >
>
> It is better to do it when "copy_data = true" to keep it compatible
> with the table's behavior.

+1

>
> > Probably we can
> > start with a single worker and extend it to have multiple workers.
>
> Yeah, starting with a single worker sounds good for now. Do you think
> we should sync all the sequences in a single transaction or have some
> threshold value above which a different transaction would be required
> or maybe a different sequence sync worker altogether? Now, having
> multiple sequence-sync workers requires some synchronization so that
> only a single worker is allocated for one sequence.
>
> The simplest thing is to use a single sequence sync worker that syncs
> all sequences in one transaction but with a large number of sequences,
> it could be inefficient. OTOH, I am not sure if it would be a problem
> in reality.

I think that we can start with using a single worker and one
transaction, and measure the performance with a large number of
sequences.

> > Or yet another idea I came up with is that a tablesync worker will
> > synchronize both the table and sequences owned by the table. That is,
> > after the tablesync worker caught up with the apply worker, the
> > tablesync worker synchronizes sequences associated with the target
> > table as well. One benefit would be that at the time of initial table
> > sync being completed, the table and its sequence data are consistent.

Correction; it's not guaranteed that the sequence data and table data
are consistent even in this case since the tablesync worker could get
on-disk sequence data that might have already been updated.

> > As soon as new changes come to the table, it would become inconsistent
> > so it might not be helpful much, though. Also, sequences that are not
> > owned by any table will still need to be synchronized by someone.
> >
>
> The other thing to consider in this idea is that we somehow need to
> distinguish the sequences owned by the table.

I think we can check pg_depend. The owned sequences reference to the table.

>
> > >
> > > 3) Refreshing the sequence can be achieved through the existing
> > > command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change
> > > here).
> > > The subscriber identifies stale sequences, meaning sequences present
> > > in pg_subscription_rel but absent from the publication, and removes
> > > them from the pg_subscription_rel system table. The subscriber also
> > > checks for newly added sequences in the publisher and synchronizes
> > > their values from the publisher using the steps outlined in the
> > > subscription creation process. It's worth noting that previously
> > > synchronized sequences won't be synchronized again; the sequence sync
> > > will occur solely for the newly added sequences.
> > >
> > > 4) Introducing a new command for refreshing all sequences: ALTER
> > > SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES.
> > > The subscriber will remove stale sequences and add newly added
> > > sequences from the publisher. Following this, it will re-synchronize
> > > the sequence values for all sequences in the updated list from the
> > > publisher, following the steps outlined in the subscription creation
> > > process.
> >
> > The difference between 3) and 4) is whether or not to re-synchronize
> > the previously synchronized sequences. Do we really want to introduce
> > a new command for 4)? I felt that we can invent an option say
> > copy_all_sequence for the REFRESH PUBLICATION command to cover the 4)
> > case.
> >
>
> Yeah, that is also an option but it could confuse along with copy_data
> option. Say the user has selected copy_data = false but
> copy_all_sequences = true then the first option indicates to *not*
> copy the data of table and sequences and the second option indicates
> to copy the sequences data which sounds contradictory. The other idea
> is to have an option copy_existing_sequences (which indicates to copy
> existing sequence values) but that also has somewhat the same drawback
> as copy_all_sequences but to a lesser degree.

Good point. And I understood that the REFRESH PUBLICATION SEQUENCES
command would be helpful when users want to synchronize sequences
between two nodes before upgrading.

>
> > >
> > > 5) Incorporate the pg_sequence_state function to fetch the sequence
> > > value from the publisher, along with the page LSN. Incorporate
> > > SetSequence function, which will procure a new relfilenode for the
> > > sequence and set the new relfilenode with the specified value. This
> > > will facilitate rollback in case of any failures.
> >
> > Does it mean that we create a new relfilenode for every update of the value?
> >
>
> We need it for initial sync so that if there is an error both the
> sequence state in pg_subscription_rel and sequence values can be
> rolled back together.

Agreed.

> However, it is unclear whether we need to create
> a new relfilenode while copying existing sequences (say during ALTER
> SUBSCRIPTION .. REFRESH PUBLICATION SEQUENCES, or whatever command we
> decide)? Probably the answer lies in how we want to implement this
> command. If we want to copy all sequence values during the command
> itself then it is probably okay but if we want to handover this task
> to the sequence-sync worker then we need some state management and a
> new relfilenode so that on error both state and sequence values are
> rolled back.

What state transition of pg_subscription_rel entries for sequences do
we need while copying sequences values? For example, we insert an
entry with 'init' state at CREATE SUBSCRIPTION and then the
sequence-sync worker updates to 'ready' and copies the sequence data.
And at REFRESH PUBLICATION SEQUENCES, we update the state back to
'init' again so that the sequence-sync worker can process it? Given
REFRESH PUBLICATION SEQUENCES won't be executed very frequently, it
might be acceptable to transactionally update sequence values.


Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Logical Replication of sequences

От
Amit Kapila
Дата:
On Thu, Jun 13, 2024 at 1:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jun 12, 2024 at 6:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > Yeah, starting with a single worker sounds good for now. Do you think
> > we should sync all the sequences in a single transaction or have some
> > threshold value above which a different transaction would be required
> > or maybe a different sequence sync worker altogether? Now, having
> > multiple sequence-sync workers requires some synchronization so that
> > only a single worker is allocated for one sequence.
> >
> > The simplest thing is to use a single sequence sync worker that syncs
> > all sequences in one transaction but with a large number of sequences,
> > it could be inefficient. OTOH, I am not sure if it would be a problem
> > in reality.
>
> I think that we can start with using a single worker and one
> transaction, and measure the performance with a large number of
> sequences.
>

Fair enough. However, this raises the question Dilip and Vignesh are
discussing whether we need a new relfilenode for sequence update even
during initial sync? As per my understanding, the idea is that similar
to tables, the CREATE SUBSCRIPTION command (with copy_data = true)
will create the new sequence entries in pg_subscription_rel with the
state as 'i'. Then the sequence-sync worker would start a transaction
and one-by-one copy the latest sequence values for each sequence (that
has state as 'i' in pg_subscription_rel) and mark its state as ready
'r' and commit the transaction. Now if there is an error during this
operation it will restart the entire operation. The idea of creating a
new relfilenode is to handle the error so that if there is a rollback,
the sequence state will be rolled back to 'i' and the sequence value
will also be rolled back. The other option could be that we update the
sequence value without a new relfilenode and if the transaction rolled
back then only the sequence's state will be rolled back to 'i'. This
would work with a minor inconsistency that sequence values will be
up-to-date even when the sequence state is 'i' in pg_subscription_rel.
I am not sure if that matters because anyway, they can quickly be
out-of-sync with the publisher again.

Now, say we don't want to maintain the state of sequences for initial
sync at all then after the error how will we detect if there are any
pending sequences to be synced? One possibility is that we maintain a
subscription level flag 'subsequencesync' in 'pg_subscription' to
indicate whether sequences need sync. This flag would indicate whether
to sync all the sequences in pg_susbcription_rel. This would mean that
if there is an error while syncing the sequences we will resync all
the sequences again. This could be acceptable considering the chances
of error during sequence sync are low. The benefit is that both the
REFRESH PUBLICATION SEQUENCES and CREATE SUBSCRIPTION can use the same
idea and sync all sequences without needing a new relfilenode. Users
can always refer 'subsequencesync' flag in 'pg_subscription' to see if
all the sequences are synced after executing the command.

> > > Or yet another idea I came up with is that a tablesync worker will
> > > synchronize both the table and sequences owned by the table. That is,
> > > after the tablesync worker caught up with the apply worker, the
> > > tablesync worker synchronizes sequences associated with the target
> > > table as well. One benefit would be that at the time of initial table
> > > sync being completed, the table and its sequence data are consistent.
>
> Correction; it's not guaranteed that the sequence data and table data
> are consistent even in this case since the tablesync worker could get
> on-disk sequence data that might have already been updated.
>

The benefit of this approach is not clear to me. Our aim is to sync
all sequences before the upgrade, so not sure if this helps because
anyway both table values and corresponding sequences can again be
out-of-sync very quickly.

> >
> > > >
> > > > 3) Refreshing the sequence can be achieved through the existing
> > > > command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change
> > > > here).
> > > > The subscriber identifies stale sequences, meaning sequences present
> > > > in pg_subscription_rel but absent from the publication, and removes
> > > > them from the pg_subscription_rel system table. The subscriber also
> > > > checks for newly added sequences in the publisher and synchronizes
> > > > their values from the publisher using the steps outlined in the
> > > > subscription creation process. It's worth noting that previously
> > > > synchronized sequences won't be synchronized again; the sequence sync
> > > > will occur solely for the newly added sequences.
> > > >
> > > > 4) Introducing a new command for refreshing all sequences: ALTER
> > > > SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES.
> > > > The subscriber will remove stale sequences and add newly added
> > > > sequences from the publisher. Following this, it will re-synchronize
> > > > the sequence values for all sequences in the updated list from the
> > > > publisher, following the steps outlined in the subscription creation
> > > > process.
> > >
> > > The difference between 3) and 4) is whether or not to re-synchronize
> > > the previously synchronized sequences. Do we really want to introduce
> > > a new command for 4)? I felt that we can invent an option say
> > > copy_all_sequence for the REFRESH PUBLICATION command to cover the 4)
> > > case.
> > >
> >
> > Yeah, that is also an option but it could confuse along with copy_data
> > option. Say the user has selected copy_data = false but
> > copy_all_sequences = true then the first option indicates to *not*
> > copy the data of table and sequences and the second option indicates
> > to copy the sequences data which sounds contradictory. The other idea
> > is to have an option copy_existing_sequences (which indicates to copy
> > existing sequence values) but that also has somewhat the same drawback
> > as copy_all_sequences but to a lesser degree.
>
> Good point. And I understood that the REFRESH PUBLICATION SEQUENCES
> command would be helpful when users want to synchronize sequences
> between two nodes before upgrading.
>

Right.

> >
> > > >
> > > > 5) Incorporate the pg_sequence_state function to fetch the sequence
> > > > value from the publisher, along with the page LSN. Incorporate
> > > > SetSequence function, which will procure a new relfilenode for the
> > > > sequence and set the new relfilenode with the specified value. This
> > > > will facilitate rollback in case of any failures.
> > >
> > > Does it mean that we create a new relfilenode for every update of the value?
> > >
> >
> > We need it for initial sync so that if there is an error both the
> > sequence state in pg_subscription_rel and sequence values can be
> > rolled back together.
>
> Agreed.
>
> > However, it is unclear whether we need to create
> > a new relfilenode while copying existing sequences (say during ALTER
> > SUBSCRIPTION .. REFRESH PUBLICATION SEQUENCES, or whatever command we
> > decide)? Probably the answer lies in how we want to implement this
> > command. If we want to copy all sequence values during the command
> > itself then it is probably okay but if we want to handover this task
> > to the sequence-sync worker then we need some state management and a
> > new relfilenode so that on error both state and sequence values are
> > rolled back.
>
> What state transition of pg_subscription_rel entries for sequences do
> we need while copying sequences values? For example, we insert an
> entry with 'init' state at CREATE SUBSCRIPTION and then the
> sequence-sync worker updates to 'ready' and copies the sequence data.
> And at REFRESH PUBLICATION SEQUENCES, we update the state back to
> 'init' again so that the sequence-sync worker can process it? Given
> REFRESH PUBLICATION SEQUENCES won't be executed very frequently, it
> might be acceptable to transactionally update sequence values.
>

Do you mean that sync the sequences during the REFRESH PUBLICATION
SEQUENCES command itself? If so, there is an argument that we can do
the same during CREATE SUBSCRIPTION. It would be beneficial to keep
the method to sync the sequences same for both the CREATE and REFRESH
commands. I have speculated on one idea above and would be happy to
see your thoughts.

--
With Regards,
Amit Kapila.



Re: Logical Replication of sequences

От
Masahiko Sawada
Дата:
On Thu, Jun 13, 2024 at 7:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Jun 13, 2024 at 1:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Jun 12, 2024 at 6:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > Yeah, starting with a single worker sounds good for now. Do you think
> > > we should sync all the sequences in a single transaction or have some
> > > threshold value above which a different transaction would be required
> > > or maybe a different sequence sync worker altogether? Now, having
> > > multiple sequence-sync workers requires some synchronization so that
> > > only a single worker is allocated for one sequence.
> > >
> > > The simplest thing is to use a single sequence sync worker that syncs
> > > all sequences in one transaction but with a large number of sequences,
> > > it could be inefficient. OTOH, I am not sure if it would be a problem
> > > in reality.
> >
> > I think that we can start with using a single worker and one
> > transaction, and measure the performance with a large number of
> > sequences.
> >
>
> Fair enough. However, this raises the question Dilip and Vignesh are
> discussing whether we need a new relfilenode for sequence update even
> during initial sync? As per my understanding, the idea is that similar
> to tables, the CREATE SUBSCRIPTION command (with copy_data = true)
> will create the new sequence entries in pg_subscription_rel with the
> state as 'i'. Then the sequence-sync worker would start a transaction
> and one-by-one copy the latest sequence values for each sequence (that
> has state as 'i' in pg_subscription_rel) and mark its state as ready
> 'r' and commit the transaction. Now if there is an error during this
> operation it will restart the entire operation. The idea of creating a
> new relfilenode is to handle the error so that if there is a rollback,
> the sequence state will be rolled back to 'i' and the sequence value
> will also be rolled back. The other option could be that we update the
> sequence value without a new relfilenode and if the transaction rolled
> back then only the sequence's state will be rolled back to 'i'. This
> would work with a minor inconsistency that sequence values will be
> up-to-date even when the sequence state is 'i' in pg_subscription_rel.
> I am not sure if that matters because anyway, they can quickly be
> out-of-sync with the publisher again.

I think it would be fine in many cases even if the sequence value is
up-to-date even when the sequence state is 'i' in pg_subscription_rel.
But the case we would like to avoid is where suppose the sequence-sync
worker does both synchronizing sequence values and updating the
sequence states for all sequences in one transaction, and if there is
an error we end up retrying the synchronization for all sequences.

>
> Now, say we don't want to maintain the state of sequences for initial
> sync at all then after the error how will we detect if there are any
> pending sequences to be synced? One possibility is that we maintain a
> subscription level flag 'subsequencesync' in 'pg_subscription' to
> indicate whether sequences need sync. This flag would indicate whether
> to sync all the sequences in pg_susbcription_rel. This would mean that
> if there is an error while syncing the sequences we will resync all
> the sequences again. This could be acceptable considering the chances
> of error during sequence sync are low. The benefit is that both the
> REFRESH PUBLICATION SEQUENCES and CREATE SUBSCRIPTION can use the same
> idea and sync all sequences without needing a new relfilenode. Users
> can always refer 'subsequencesync' flag in 'pg_subscription' to see if
> all the sequences are synced after executing the command.

I think that REFRESH PUBLICATION {SEQUENCES} can be executed even
while the sequence-sync worker is synchronizing sequences. In this
case, the worker might not see new sequences added by the concurrent
REFRESH PUBLICATION {SEQUENCES} command since it's already running.
The worker could end up marking the subsequencesync as completed while
not synchronizing these new sequences.

>
> > > > Or yet another idea I came up with is that a tablesync worker will
> > > > synchronize both the table and sequences owned by the table. That is,
> > > > after the tablesync worker caught up with the apply worker, the
> > > > tablesync worker synchronizes sequences associated with the target
> > > > table as well. One benefit would be that at the time of initial table
> > > > sync being completed, the table and its sequence data are consistent.
> >
> > Correction; it's not guaranteed that the sequence data and table data
> > are consistent even in this case since the tablesync worker could get
> > on-disk sequence data that might have already been updated.
> >
>
> The benefit of this approach is not clear to me. Our aim is to sync
> all sequences before the upgrade, so not sure if this helps because
> anyway both table values and corresponding sequences can again be
> out-of-sync very quickly.

Right.

Given that our aim is to sync all sequences before the upgrade, do we
need to synchronize sequences even at CREATE SUBSCRIPTION time? In
cases where there are a large number of sequences, synchronizing
sequences in addition to tables could be overhead and make less sense,
because sequences can again be out-of-sync quickly and typically
CREATE SUBSCRIPTION is not created just before the upgrade.

>
> > >
> > > > >
> > > > > 5) Incorporate the pg_sequence_state function to fetch the sequence
> > > > > value from the publisher, along with the page LSN. Incorporate
> > > > > SetSequence function, which will procure a new relfilenode for the
> > > > > sequence and set the new relfilenode with the specified value. This
> > > > > will facilitate rollback in case of any failures.
> > > >
> > > > Does it mean that we create a new relfilenode for every update of the value?
> > > >
> > >
> > > We need it for initial sync so that if there is an error both the
> > > sequence state in pg_subscription_rel and sequence values can be
> > > rolled back together.
> >
> > Agreed.
> >
> > > However, it is unclear whether we need to create
> > > a new relfilenode while copying existing sequences (say during ALTER
> > > SUBSCRIPTION .. REFRESH PUBLICATION SEQUENCES, or whatever command we
> > > decide)? Probably the answer lies in how we want to implement this
> > > command. If we want to copy all sequence values during the command
> > > itself then it is probably okay but if we want to handover this task
> > > to the sequence-sync worker then we need some state management and a
> > > new relfilenode so that on error both state and sequence values are
> > > rolled back.
> >
> > What state transition of pg_subscription_rel entries for sequences do
> > we need while copying sequences values? For example, we insert an
> > entry with 'init' state at CREATE SUBSCRIPTION and then the
> > sequence-sync worker updates to 'ready' and copies the sequence data.
> > And at REFRESH PUBLICATION SEQUENCES, we update the state back to
> > 'init' again so that the sequence-sync worker can process it? Given
> > REFRESH PUBLICATION SEQUENCES won't be executed very frequently, it
> > might be acceptable to transactionally update sequence values.
> >
>
> Do you mean that sync the sequences during the REFRESH PUBLICATION
> SEQUENCES command itself? If so, there is an argument that we can do
> the same during CREATE SUBSCRIPTION. It would be beneficial to keep
> the method to sync the sequences same for both the CREATE and REFRESH
> commands. I have speculated on one idea above and would be happy to
> see your thoughts.

I meant that the REFRESH PUBLICATION SEQUENCES command updates all
sequence states in pg_subscription_rel to 'init' state, and the
sequence-sync worker can do the synchronization work. We use the same
method for both the CREATE SUBSCRIPTION and REFRESH PUBLICATION
{SEQUENCES} commands.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Logical Replication of sequences

От
Michael Paquier
Дата:
On Thu, Jun 13, 2024 at 03:36:05PM +0530, Amit Kapila wrote:
> Fair enough. However, this raises the question Dilip and Vignesh are
> discussing whether we need a new relfilenode for sequence update even
> during initial sync? As per my understanding, the idea is that similar
> to tables, the CREATE SUBSCRIPTION command (with copy_data = true)
> will create the new sequence entries in pg_subscription_rel with the
> state as 'i'. Then the sequence-sync worker would start a transaction
> and one-by-one copy the latest sequence values for each sequence (that
> has state as 'i' in pg_subscription_rel) and mark its state as ready
> 'r' and commit the transaction. Now if there is an error during this
> operation it will restart the entire operation.

Hmm.  You mean to use only one transaction for all the sequences?
I've heard about deployments with a lot of them.  Could it be a
problem to process them in batches, as well?  If you maintain a state
for each one of them in pg_subscription_rel, it does not strike me as
an issue, while being more flexible than an all-or-nothing.

> The idea of creating a
> new relfilenode is to handle the error so that if there is a rollback,
> the sequence state will be rolled back to 'i' and the sequence value
> will also be rolled back. The other option could be that we update the
> sequence value without a new relfilenode and if the transaction rolled
> back then only the sequence's state will be rolled back to 'i'. This
> would work with a minor inconsistency that sequence values will be
> up-to-date even when the sequence state is 'i' in pg_subscription_rel.
> I am not sure if that matters because anyway, they can quickly be
> out-of-sync with the publisher again.

Seeing a mention to relfilenodes specifically for sequences freaks me
out a bit, because there's some work I have been doing in this area
and sequences may not have a need for a physical relfilenode at all.
But I guess that you refer to the fact that like tables, relfilenodes
would only be created as required because anything you'd do in the
apply worker path would just call some of the routines of sequence.h,
right?

> Now, say we don't want to maintain the state of sequences for initial
> sync at all then after the error how will we detect if there are any
> pending sequences to be synced? One possibility is that we maintain a
> subscription level flag 'subsequencesync' in 'pg_subscription' to
> indicate whether sequences need sync. This flag would indicate whether
> to sync all the sequences in pg_susbcription_rel. This would mean that
> if there is an error while syncing the sequences we will resync all
> the sequences again. This could be acceptable considering the chances
> of error during sequence sync are low.

There could be multiple subscriptions to a single database that point
to the same set of sequences.  Is there any conflict issue to worry
about here?

> The benefit is that both the
> REFRESH PUBLICATION SEQUENCES and CREATE SUBSCRIPTION can use the same
> idea and sync all sequences without needing a new relfilenode. Users
> can always refer 'subsequencesync' flag in 'pg_subscription' to see if
> all the sequences are synced after executing the command.

That would be cheaper, indeed.  Isn't a boolean too limiting?
Isn't that something you'd want to track with a LSN as "the point in
WAL where all the sequences have been synced"?

The approach of doing all the sync work from the subscriber, while
having a command that can be kicked from the subscriber side is a good
user experience.
--
Michael

Вложения

Re: Logical Replication of sequences

От
Amit Kapila
Дата:
On Thu, Jun 13, 2024 at 6:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Jun 13, 2024 at 7:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Jun 13, 2024 at 1:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Wed, Jun 12, 2024 at 6:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > >
> > > > Yeah, starting with a single worker sounds good for now. Do you think
> > > > we should sync all the sequences in a single transaction or have some
> > > > threshold value above which a different transaction would be required
> > > > or maybe a different sequence sync worker altogether? Now, having
> > > > multiple sequence-sync workers requires some synchronization so that
> > > > only a single worker is allocated for one sequence.
> > > >
> > > > The simplest thing is to use a single sequence sync worker that syncs
> > > > all sequences in one transaction but with a large number of sequences,
> > > > it could be inefficient. OTOH, I am not sure if it would be a problem
> > > > in reality.
> > >
> > > I think that we can start with using a single worker and one
> > > transaction, and measure the performance with a large number of
> > > sequences.
> > >
> >
> > Fair enough. However, this raises the question Dilip and Vignesh are
> > discussing whether we need a new relfilenode for sequence update even
> > during initial sync? As per my understanding, the idea is that similar
> > to tables, the CREATE SUBSCRIPTION command (with copy_data = true)
> > will create the new sequence entries in pg_subscription_rel with the
> > state as 'i'. Then the sequence-sync worker would start a transaction
> > and one-by-one copy the latest sequence values for each sequence (that
> > has state as 'i' in pg_subscription_rel) and mark its state as ready
> > 'r' and commit the transaction. Now if there is an error during this
> > operation it will restart the entire operation. The idea of creating a
> > new relfilenode is to handle the error so that if there is a rollback,
> > the sequence state will be rolled back to 'i' and the sequence value
> > will also be rolled back. The other option could be that we update the
> > sequence value without a new relfilenode and if the transaction rolled
> > back then only the sequence's state will be rolled back to 'i'. This
> > would work with a minor inconsistency that sequence values will be
> > up-to-date even when the sequence state is 'i' in pg_subscription_rel.
> > I am not sure if that matters because anyway, they can quickly be
> > out-of-sync with the publisher again.
>
> I think it would be fine in many cases even if the sequence value is
> up-to-date even when the sequence state is 'i' in pg_subscription_rel.
> But the case we would like to avoid is where suppose the sequence-sync
> worker does both synchronizing sequence values and updating the
> sequence states for all sequences in one transaction, and if there is
> an error we end up retrying the synchronization for all sequences.
>

The one idea to avoid this is to update sequences in chunks (say 100
or some threshold number of sequences in one transaction). Then we
would only redo the sync for the last and pending set of sequences.

> >
> > Now, say we don't want to maintain the state of sequences for initial
> > sync at all then after the error how will we detect if there are any
> > pending sequences to be synced? One possibility is that we maintain a
> > subscription level flag 'subsequencesync' in 'pg_subscription' to
> > indicate whether sequences need sync. This flag would indicate whether
> > to sync all the sequences in pg_susbcription_rel. This would mean that
> > if there is an error while syncing the sequences we will resync all
> > the sequences again. This could be acceptable considering the chances
> > of error during sequence sync are low. The benefit is that both the
> > REFRESH PUBLICATION SEQUENCES and CREATE SUBSCRIPTION can use the same
> > idea and sync all sequences without needing a new relfilenode. Users
> > can always refer 'subsequencesync' flag in 'pg_subscription' to see if
> > all the sequences are synced after executing the command.
>
> I think that REFRESH PUBLICATION {SEQUENCES} can be executed even
> while the sequence-sync worker is synchronizing sequences. In this
> case, the worker might not see new sequences added by the concurrent
> REFRESH PUBLICATION {SEQUENCES} command since it's already running.
> The worker could end up marking the subsequencesync as completed while
> not synchronizing these new sequences.
>

This is possible but we could avoid REFRESH PUBLICATION {SEQUENCES} by
not allowing to change the subsequencestate during the time
sequence-worker is syncing the sequences. This could be restrictive
but there doesn't seem to be cases where user would like to
immediately refresh sequences after creating the subscription.

> >
> > > > > Or yet another idea I came up with is that a tablesync worker will
> > > > > synchronize both the table and sequences owned by the table. That is,
> > > > > after the tablesync worker caught up with the apply worker, the
> > > > > tablesync worker synchronizes sequences associated with the target
> > > > > table as well. One benefit would be that at the time of initial table
> > > > > sync being completed, the table and its sequence data are consistent.
> > >
> > > Correction; it's not guaranteed that the sequence data and table data
> > > are consistent even in this case since the tablesync worker could get
> > > on-disk sequence data that might have already been updated.
> > >
> >
> > The benefit of this approach is not clear to me. Our aim is to sync
> > all sequences before the upgrade, so not sure if this helps because
> > anyway both table values and corresponding sequences can again be
> > out-of-sync very quickly.
>
> Right.
>
> Given that our aim is to sync all sequences before the upgrade, do we
> need to synchronize sequences even at CREATE SUBSCRIPTION time? In
> cases where there are a large number of sequences, synchronizing
> sequences in addition to tables could be overhead and make less sense,
> because sequences can again be out-of-sync quickly and typically
> CREATE SUBSCRIPTION is not created just before the upgrade.
>

I think for the upgrade one should be creating a subscription just
before the upgrade. Isn't something similar is done even in the
upgrade steps you shared once [1]? Typically users should get all the
data from the publisher before the upgrade of the publisher via
creating a subscription. Also, it would be better to keep the
implementation of sequences close to tables wherever possible. Having
said that, I understand your point as well and if you strongly feel
that we don't need to sync sequences at the time of CREATE
SUBSCRIPTION and others also don't see any problem with it then we can
consider that as well.

> > >
> >
> > Do you mean that sync the sequences during the REFRESH PUBLICATION
> > SEQUENCES command itself? If so, there is an argument that we can do
> > the same during CREATE SUBSCRIPTION. It would be beneficial to keep
> > the method to sync the sequences same for both the CREATE and REFRESH
> > commands. I have speculated on one idea above and would be happy to
> > see your thoughts.
>
> I meant that the REFRESH PUBLICATION SEQUENCES command updates all
> sequence states in pg_subscription_rel to 'init' state, and the
> sequence-sync worker can do the synchronization work. We use the same
> method for both the CREATE SUBSCRIPTION and REFRESH PUBLICATION
> {SEQUENCES} commands.
>

Marking the state as 'init' when we would have already synced the
sequences sounds a bit odd but otherwise, this could also work if we
accept that even if the sequences are synced and value could remain in
'init' state (on rollbacks).


[1] - https://knock.app/blog/zero-downtime-postgres-upgrades

--
With Regards,
Amit Kapila.



Re: Logical Replication of sequences

От
Amit Kapila
Дата:
On Fri, Jun 14, 2024 at 5:16 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Jun 13, 2024 at 03:36:05PM +0530, Amit Kapila wrote:
> > Fair enough. However, this raises the question Dilip and Vignesh are
> > discussing whether we need a new relfilenode for sequence update even
> > during initial sync? As per my understanding, the idea is that similar
> > to tables, the CREATE SUBSCRIPTION command (with copy_data = true)
> > will create the new sequence entries in pg_subscription_rel with the
> > state as 'i'. Then the sequence-sync worker would start a transaction
> > and one-by-one copy the latest sequence values for each sequence (that
> > has state as 'i' in pg_subscription_rel) and mark its state as ready
> > 'r' and commit the transaction. Now if there is an error during this
> > operation it will restart the entire operation.
>
> Hmm.  You mean to use only one transaction for all the sequences?
> I've heard about deployments with a lot of them.  Could it be a
> problem to process them in batches, as well?

I don't think so. We can even sync one sequence per transaction but
then it would be resource and time consuming without much gain. As
mentioned in a previous email, we might want to sync 100 or some other
threshold number of sequences per transaction. The other possibility
is to make a subscription-level option for this batch size but I don't
see much advantage in doing so as it won't be convenient for users to
set it. I feel we should pick some threshold number that is neither
too low nor too high and if we later see any problem with it, we can
make it a configurable knob.

>
> > The idea of creating a
> > new relfilenode is to handle the error so that if there is a rollback,
> > the sequence state will be rolled back to 'i' and the sequence value
> > will also be rolled back. The other option could be that we update the
> > sequence value without a new relfilenode and if the transaction rolled
> > back then only the sequence's state will be rolled back to 'i'. This
> > would work with a minor inconsistency that sequence values will be
> > up-to-date even when the sequence state is 'i' in pg_subscription_rel.
> > I am not sure if that matters because anyway, they can quickly be
> > out-of-sync with the publisher again.
>
> Seeing a mention to relfilenodes specifically for sequences freaks me
> out a bit, because there's some work I have been doing in this area
> and sequences may not have a need for a physical relfilenode at all.
> But I guess that you refer to the fact that like tables, relfilenodes
> would only be created as required because anything you'd do in the
> apply worker path would just call some of the routines of sequence.h,
> right?
>

Yes, I think so. The only thing the patch expects is a way to rollback
the sequence changes if the transaction rolls back during the initial
sync. But I am not sure if we need such a behavior. The discussion for
the same is in progress. Let's wait for the outcome.

> > Now, say we don't want to maintain the state of sequences for initial
> > sync at all then after the error how will we detect if there are any
> > pending sequences to be synced? One possibility is that we maintain a
> > subscription level flag 'subsequencesync' in 'pg_subscription' to
> > indicate whether sequences need sync. This flag would indicate whether
> > to sync all the sequences in pg_susbcription_rel. This would mean that
> > if there is an error while syncing the sequences we will resync all
> > the sequences again. This could be acceptable considering the chances
> > of error during sequence sync are low.
>
> There could be multiple subscriptions to a single database that point
> to the same set of sequences.  Is there any conflict issue to worry
> about here?
>

I don't think so. In the worst case, the same value would be copied
twice. The same scenario in case of tables could lead to duplicate
data or unique key violation ERRORs which is much worse. So, I expect
users to be careful about the same.

> > The benefit is that both the
> > REFRESH PUBLICATION SEQUENCES and CREATE SUBSCRIPTION can use the same
> > idea and sync all sequences without needing a new relfilenode. Users
> > can always refer 'subsequencesync' flag in 'pg_subscription' to see if
> > all the sequences are synced after executing the command.
>
> That would be cheaper, indeed.  Isn't a boolean too limiting?
>

In this idea, we only need a flag to say whether the sequence sync is
required or not.

> Isn't that something you'd want to track with a LSN as "the point in
> WAL where all the sequences have been synced"?
>

It won't be any better for the required purpose because after CREATE
SUBSCRIPTION, if REFERESH wants to toggle the flag to indicate the
sequences need sync again then using LSN would mean we need to set it
to Invalid value.

> The approach of doing all the sync work from the subscriber, while
> having a command that can be kicked from the subscriber side is a good
> user experience.
>

Thank you for endorsing the idea.

--
With Regards,
Amit Kapila.



Re: Logical Replication of sequences

От
Masahiko Sawada
Дата:
On Fri, Jun 14, 2024 at 4:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Jun 13, 2024 at 6:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Jun 13, 2024 at 7:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Jun 13, 2024 at 1:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Wed, Jun 12, 2024 at 6:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > >
> > > > > Yeah, starting with a single worker sounds good for now. Do you think
> > > > > we should sync all the sequences in a single transaction or have some
> > > > > threshold value above which a different transaction would be required
> > > > > or maybe a different sequence sync worker altogether? Now, having
> > > > > multiple sequence-sync workers requires some synchronization so that
> > > > > only a single worker is allocated for one sequence.
> > > > >
> > > > > The simplest thing is to use a single sequence sync worker that syncs
> > > > > all sequences in one transaction but with a large number of sequences,
> > > > > it could be inefficient. OTOH, I am not sure if it would be a problem
> > > > > in reality.
> > > >
> > > > I think that we can start with using a single worker and one
> > > > transaction, and measure the performance with a large number of
> > > > sequences.
> > > >
> > >
> > > Fair enough. However, this raises the question Dilip and Vignesh are
> > > discussing whether we need a new relfilenode for sequence update even
> > > during initial sync? As per my understanding, the idea is that similar
> > > to tables, the CREATE SUBSCRIPTION command (with copy_data = true)
> > > will create the new sequence entries in pg_subscription_rel with the
> > > state as 'i'. Then the sequence-sync worker would start a transaction
> > > and one-by-one copy the latest sequence values for each sequence (that
> > > has state as 'i' in pg_subscription_rel) and mark its state as ready
> > > 'r' and commit the transaction. Now if there is an error during this
> > > operation it will restart the entire operation. The idea of creating a
> > > new relfilenode is to handle the error so that if there is a rollback,
> > > the sequence state will be rolled back to 'i' and the sequence value
> > > will also be rolled back. The other option could be that we update the
> > > sequence value without a new relfilenode and if the transaction rolled
> > > back then only the sequence's state will be rolled back to 'i'. This
> > > would work with a minor inconsistency that sequence values will be
> > > up-to-date even when the sequence state is 'i' in pg_subscription_rel.
> > > I am not sure if that matters because anyway, they can quickly be
> > > out-of-sync with the publisher again.
> >
> > I think it would be fine in many cases even if the sequence value is
> > up-to-date even when the sequence state is 'i' in pg_subscription_rel.
> > But the case we would like to avoid is where suppose the sequence-sync
> > worker does both synchronizing sequence values and updating the
> > sequence states for all sequences in one transaction, and if there is
> > an error we end up retrying the synchronization for all sequences.
> >
>
> The one idea to avoid this is to update sequences in chunks (say 100
> or some threshold number of sequences in one transaction). Then we
> would only redo the sync for the last and pending set of sequences.

That could be one idea.

>
> > >
> > > Now, say we don't want to maintain the state of sequences for initial
> > > sync at all then after the error how will we detect if there are any
> > > pending sequences to be synced? One possibility is that we maintain a
> > > subscription level flag 'subsequencesync' in 'pg_subscription' to
> > > indicate whether sequences need sync. This flag would indicate whether
> > > to sync all the sequences in pg_susbcription_rel. This would mean that
> > > if there is an error while syncing the sequences we will resync all
> > > the sequences again. This could be acceptable considering the chances
> > > of error during sequence sync are low. The benefit is that both the
> > > REFRESH PUBLICATION SEQUENCES and CREATE SUBSCRIPTION can use the same
> > > idea and sync all sequences without needing a new relfilenode. Users
> > > can always refer 'subsequencesync' flag in 'pg_subscription' to see if
> > > all the sequences are synced after executing the command.
> >
> > I think that REFRESH PUBLICATION {SEQUENCES} can be executed even
> > while the sequence-sync worker is synchronizing sequences. In this
> > case, the worker might not see new sequences added by the concurrent
> > REFRESH PUBLICATION {SEQUENCES} command since it's already running.
> > The worker could end up marking the subsequencesync as completed while
> > not synchronizing these new sequences.
> >
>
> This is possible but we could avoid REFRESH PUBLICATION {SEQUENCES} by
> not allowing to change the subsequencestate during the time
> sequence-worker is syncing the sequences. This could be restrictive
> but there doesn't seem to be cases where user would like to
> immediately refresh sequences after creating the subscription.

I'm concerned that users would not be able to add sequences during the
time the sequence-worker is syncing the sequences. For example,
suppose we have 10000 sequences and execute REFRESH PUBLICATION
{SEQUENCES} to synchronize 10000 sequences. Now if we add one sequence
to the publication and want to synchronize it to the subscriber, we
have to wait for the current REFRESH PUBLICATION {SEQUENCES} to
complete, and then execute it again, synchronizing 10001 sequences,
instead of synchronizing only the new one.

>
> > >
> > > > > > Or yet another idea I came up with is that a tablesync worker will
> > > > > > synchronize both the table and sequences owned by the table. That is,
> > > > > > after the tablesync worker caught up with the apply worker, the
> > > > > > tablesync worker synchronizes sequences associated with the target
> > > > > > table as well. One benefit would be that at the time of initial table
> > > > > > sync being completed, the table and its sequence data are consistent.
> > > >
> > > > Correction; it's not guaranteed that the sequence data and table data
> > > > are consistent even in this case since the tablesync worker could get
> > > > on-disk sequence data that might have already been updated.
> > > >
> > >
> > > The benefit of this approach is not clear to me. Our aim is to sync
> > > all sequences before the upgrade, so not sure if this helps because
> > > anyway both table values and corresponding sequences can again be
> > > out-of-sync very quickly.
> >
> > Right.
> >
> > Given that our aim is to sync all sequences before the upgrade, do we
> > need to synchronize sequences even at CREATE SUBSCRIPTION time? In
> > cases where there are a large number of sequences, synchronizing
> > sequences in addition to tables could be overhead and make less sense,
> > because sequences can again be out-of-sync quickly and typically
> > CREATE SUBSCRIPTION is not created just before the upgrade.
> >
>
> I think for the upgrade one should be creating a subscription just
> before the upgrade. Isn't something similar is done even in the
> upgrade steps you shared once [1]?

I might be missing something but in the blog post they created
subscriptions in various ways, waited for the initial table data sync
to complete, and then set the sequence values with a buffer based on
the old cluster. What I imagined with this sequence synchronization
feature is that after the initial table sync completes, we stop to
execute further transactions on the publisher, synchronize sequences
using REFRESH PUBLICATION {SEQUENCES}, and resume the application to
execute transactions on the subscriber. So a subscription would be
created just before the upgrade, but sequence synchronization would
not necessarily happen at the same time of the initial table data
synchronization.

> Typically users should get all the
> data from the publisher before the upgrade of the publisher via
> creating a subscription. Also, it would be better to keep the
> implementation of sequences close to tables wherever possible. Having
> said that, I understand your point as well and if you strongly feel
> that we don't need to sync sequences at the time of CREATE
> SUBSCRIPTION and others also don't see any problem with it then we can
> consider that as well.

I see your point that it's better to keep the implementation of
sequences close to the table one. So I agree that we can start with
this approach, and we will see how it works in practice and consider
other options later.

>
> > > >
> > >
> > > Do you mean that sync the sequences during the REFRESH PUBLICATION
> > > SEQUENCES command itself? If so, there is an argument that we can do
> > > the same during CREATE SUBSCRIPTION. It would be beneficial to keep
> > > the method to sync the sequences same for both the CREATE and REFRESH
> > > commands. I have speculated on one idea above and would be happy to
> > > see your thoughts.
> >
> > I meant that the REFRESH PUBLICATION SEQUENCES command updates all
> > sequence states in pg_subscription_rel to 'init' state, and the
> > sequence-sync worker can do the synchronization work. We use the same
> > method for both the CREATE SUBSCRIPTION and REFRESH PUBLICATION
> > {SEQUENCES} commands.
> >
>
> Marking the state as 'init' when we would have already synced the
> sequences sounds a bit odd but otherwise, this could also work if we
> accept that even if the sequences are synced and value could remain in
> 'init' state (on rollbacks).

I mean that it's just for identifying sequences that need to be
synced. With the idea of using sequence states in pg_subscription_rel,
the REFRESH PUBLICATION SEQUENCES command needs to change states to
something so that the sequence-sync worker can identify which sequence
needs to be synced. If 'init' sounds odd, we can invent a new state
for sequences, say 'needs-to-be-syned'.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Logical Replication of sequences

От
Amit Kapila
Дата:
On Tue, Jun 18, 2024 at 7:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Jun 14, 2024 at 4:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Jun 13, 2024 at 6:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> >
> > > >
> > > > Now, say we don't want to maintain the state of sequences for initial
> > > > sync at all then after the error how will we detect if there are any
> > > > pending sequences to be synced? One possibility is that we maintain a
> > > > subscription level flag 'subsequencesync' in 'pg_subscription' to
> > > > indicate whether sequences need sync. This flag would indicate whether
> > > > to sync all the sequences in pg_susbcription_rel. This would mean that
> > > > if there is an error while syncing the sequences we will resync all
> > > > the sequences again. This could be acceptable considering the chances
> > > > of error during sequence sync are low. The benefit is that both the
> > > > REFRESH PUBLICATION SEQUENCES and CREATE SUBSCRIPTION can use the same
> > > > idea and sync all sequences without needing a new relfilenode. Users
> > > > can always refer 'subsequencesync' flag in 'pg_subscription' to see if
> > > > all the sequences are synced after executing the command.
> > >
> > > I think that REFRESH PUBLICATION {SEQUENCES} can be executed even
> > > while the sequence-sync worker is synchronizing sequences. In this
> > > case, the worker might not see new sequences added by the concurrent
> > > REFRESH PUBLICATION {SEQUENCES} command since it's already running.
> > > The worker could end up marking the subsequencesync as completed while
> > > not synchronizing these new sequences.
> > >
> >
> > This is possible but we could avoid REFRESH PUBLICATION {SEQUENCES} by
> > not allowing to change the subsequencestate during the time
> > sequence-worker is syncing the sequences. This could be restrictive
> > but there doesn't seem to be cases where user would like to
> > immediately refresh sequences after creating the subscription.
>
> I'm concerned that users would not be able to add sequences during the
> time the sequence-worker is syncing the sequences. For example,
> suppose we have 10000 sequences and execute REFRESH PUBLICATION
> {SEQUENCES} to synchronize 10000 sequences. Now if we add one sequence
> to the publication and want to synchronize it to the subscriber, we
> have to wait for the current REFRESH PUBLICATION {SEQUENCES} to
> complete, and then execute it again, synchronizing 10001 sequences,
> instead of synchronizing only the new one.
>

I see your point and it could hurt such scenarios even though they
won't be frequent. So, let's focus on our other approach of
maintaining the flag at a per-sequence level in pg_subscription_rel.

> >
> > > >
> > > > > > > Or yet another idea I came up with is that a tablesync worker will
> > > > > > > synchronize both the table and sequences owned by the table. That is,
> > > > > > > after the tablesync worker caught up with the apply worker, the
> > > > > > > tablesync worker synchronizes sequences associated with the target
> > > > > > > table as well. One benefit would be that at the time of initial table
> > > > > > > sync being completed, the table and its sequence data are consistent.
> > > > >
> > > > > Correction; it's not guaranteed that the sequence data and table data
> > > > > are consistent even in this case since the tablesync worker could get
> > > > > on-disk sequence data that might have already been updated.
> > > > >
> > > >
> > > > The benefit of this approach is not clear to me. Our aim is to sync
> > > > all sequences before the upgrade, so not sure if this helps because
> > > > anyway both table values and corresponding sequences can again be
> > > > out-of-sync very quickly.
> > >
> > > Right.
> > >
> > > Given that our aim is to sync all sequences before the upgrade, do we
> > > need to synchronize sequences even at CREATE SUBSCRIPTION time? In
> > > cases where there are a large number of sequences, synchronizing
> > > sequences in addition to tables could be overhead and make less sense,
> > > because sequences can again be out-of-sync quickly and typically
> > > CREATE SUBSCRIPTION is not created just before the upgrade.
> > >
> >
> > I think for the upgrade one should be creating a subscription just
> > before the upgrade. Isn't something similar is done even in the
> > upgrade steps you shared once [1]?
>
> I might be missing something but in the blog post they created
> subscriptions in various ways, waited for the initial table data sync
> to complete, and then set the sequence values with a buffer based on
> the old cluster. What I imagined with this sequence synchronization
> feature is that after the initial table sync completes, we stop to
> execute further transactions on the publisher, synchronize sequences
> using REFRESH PUBLICATION {SEQUENCES}, and resume the application to
> execute transactions on the subscriber. So a subscription would be
> created just before the upgrade, but sequence synchronization would
> not necessarily happen at the same time of the initial table data
> synchronization.
>

It depends on the exact steps of the upgrade. For example, if one
stops the publisher before adding sequences to a subscription either
via create subscription or alter subscription add/set command then
there won't be a need for a separate refresh but OTOH, if one follows
the steps you mentioned then the refresh would be required. As you are
okay, with syncing the sequences while creating a subscription in the
below part of the email, there is not much point in arguing about this
further.

> > Typically users should get all the
> > data from the publisher before the upgrade of the publisher via
> > creating a subscription. Also, it would be better to keep the
> > implementation of sequences close to tables wherever possible. Having
> > said that, I understand your point as well and if you strongly feel
> > that we don't need to sync sequences at the time of CREATE
> > SUBSCRIPTION and others also don't see any problem with it then we can
> > consider that as well.
>
> I see your point that it's better to keep the implementation of
> sequences close to the table one. So I agree that we can start with
> this approach, and we will see how it works in practice and consider
> other options later.
>

makes sense.

> >
> > Marking the state as 'init' when we would have already synced the
> > sequences sounds a bit odd but otherwise, this could also work if we
> > accept that even if the sequences are synced and value could remain in
> > 'init' state (on rollbacks).
>
> I mean that it's just for identifying sequences that need to be
> synced. With the idea of using sequence states in pg_subscription_rel,
> the REFRESH PUBLICATION SEQUENCES command needs to change states to
> something so that the sequence-sync worker can identify which sequence
> needs to be synced. If 'init' sounds odd, we can invent a new state
> for sequences, say 'needs-to-be-syned'.
>

Agreed and I am not sure which is better because there is a value in
keeping the state name the same for both sequences and tables. We
probably need more comments in code and doc updates to make the
behavior clear. We can start with the sequence state as 'init' for
'needs-to-be-sycned' and 'ready' for 'synced' and can change if others
feel so during the review.

--
With Regards,
Amit Kapila.



Re: Logical Replication of sequences

От
vignesh C
Дата:
On Tue, 18 Jun 2024 at 16:10, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>
> Agreed and I am not sure which is better because there is a value in
> keeping the state name the same for both sequences and tables. We
> probably need more comments in code and doc updates to make the
> behavior clear. We can start with the sequence state as 'init' for
> 'needs-to-be-sycned' and 'ready' for 'synced' and can change if others
> feel so during the review.

Here is a patch which does the sequence synchronization in the
following lines from the above discussion:
This commit introduces sequence synchronization during 1) creation of
subscription for initial sync of sequences 2) refresh publication to
synchronize the sequences for the newly created sequences 3) refresh
publication sequences for synchronizing all the sequences.
1) During subscription creation with CREATE SUBSCRIPTION (no syntax change):
   - The subscriber retrieves sequences associated with publications.
   - Sequences  are added in the 'init' state to the pg_subscription_rel table.
   - Sequence synchronization worker will be started if there are any
sequences to be synchronized
   - A new sequence synchronization worker handles synchronization in
batches of 100 sequences:
     a) Retrieves sequence values using pg_sequence_state from the publisher.
     b) Sets sequence values accordingly.
     c) Updates sequence state to 'READY' in pg_susbcripion_rel
     d) Commits batches of 100 synchronized sequences.
2) Refreshing sequences with ALTER SUBSCRIPTION ... REFRESH
PUBLICATION (no syntax change):
   - Stale sequences are removed from pg_subscription_rel.
   - Newly added sequences in the publisher are added in 'init' state
to pg_subscription_rel.
   - Sequence synchronization will be done by sequence sync worker as
listed in subscription creation process.
   - Sequence synchronization occurs for newly added sequences only.
3) Introduce new command ALTER SUBSCRIPTION ... REFRESH PUBLICATION
SEQUENCES  for refreshing all sequences:
   - Removes stale sequences and adds newly added sequences from the
publisher to pg_subscription_rel.
   - Resets all sequences in pg_subscription_rel to 'init' state.
   - Initiates sequence synchronization for all sequences by sequence
sync worker as listed in subscription creation process.

Regards,
Vignesh

Вложения

Re: Logical Replication of sequences

От
vignesh C
Дата:
On Wed, 19 Jun 2024 at 20:33, vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, 18 Jun 2024 at 16:10, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > Agreed and I am not sure which is better because there is a value in
> > keeping the state name the same for both sequences and tables. We
> > probably need more comments in code and doc updates to make the
> > behavior clear. We can start with the sequence state as 'init' for
> > 'needs-to-be-sycned' and 'ready' for 'synced' and can change if others
> > feel so during the review.
>
> Here is a patch which does the sequence synchronization in the
> following lines from the above discussion:
> This commit introduces sequence synchronization during 1) creation of
> subscription for initial sync of sequences 2) refresh publication to
> synchronize the sequences for the newly created sequences 3) refresh
> publication sequences for synchronizing all the sequences.
> 1) During subscription creation with CREATE SUBSCRIPTION (no syntax change):
>    - The subscriber retrieves sequences associated with publications.
>    - Sequences  are added in the 'init' state to the pg_subscription_rel table.
>    - Sequence synchronization worker will be started if there are any
> sequences to be synchronized
>    - A new sequence synchronization worker handles synchronization in
> batches of 100 sequences:
>      a) Retrieves sequence values using pg_sequence_state from the publisher.
>      b) Sets sequence values accordingly.
>      c) Updates sequence state to 'READY' in pg_susbcripion_rel
>      d) Commits batches of 100 synchronized sequences.
> 2) Refreshing sequences with ALTER SUBSCRIPTION ... REFRESH
> PUBLICATION (no syntax change):
>    - Stale sequences are removed from pg_subscription_rel.
>    - Newly added sequences in the publisher are added in 'init' state
> to pg_subscription_rel.
>    - Sequence synchronization will be done by sequence sync worker as
> listed in subscription creation process.
>    - Sequence synchronization occurs for newly added sequences only.
> 3) Introduce new command ALTER SUBSCRIPTION ... REFRESH PUBLICATION
> SEQUENCES  for refreshing all sequences:
>    - Removes stale sequences and adds newly added sequences from the
> publisher to pg_subscription_rel.
>    - Resets all sequences in pg_subscription_rel to 'init' state.
>    - Initiates sequence synchronization for all sequences by sequence
> sync worker as listed in subscription creation process.

Here is an updated patch with a few fixes to remove an unused
function, changed a few references of table to sequence and added one
CHECK_FOR_INTERRUPTS in the sequence sync worker loop.

Regards,
Vignesh

Вложения

Re: Logical Replication of sequences

От
Amit Kapila
Дата:
On Wed, Jun 19, 2024 at 8:33 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Tue, 18 Jun 2024 at 16:10, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > Agreed and I am not sure which is better because there is a value in
> > keeping the state name the same for both sequences and tables. We
> > probably need more comments in code and doc updates to make the
> > behavior clear. We can start with the sequence state as 'init' for
> > 'needs-to-be-sycned' and 'ready' for 'synced' and can change if others
> > feel so during the review.
>
> Here is a patch which does the sequence synchronization in the
> following lines from the above discussion:
>

Thanks for summarizing the points discussed. I would like to confirm
whether the patch replicates new sequences that are created
implicitly/explicitly for a publication defined as ALL SEQUENCES.

--
With Regards,
Amit Kapila.



Re: Logical Replication of sequences

От
vignesh C
Дата:
On Thu, 20 Jun 2024 at 18:45, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jun 19, 2024 at 8:33 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Tue, 18 Jun 2024 at 16:10, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > Agreed and I am not sure which is better because there is a value in
> > > keeping the state name the same for both sequences and tables. We
> > > probably need more comments in code and doc updates to make the
> > > behavior clear. We can start with the sequence state as 'init' for
> > > 'needs-to-be-sycned' and 'ready' for 'synced' and can change if others
> > > feel so during the review.
> >
> > Here is a patch which does the sequence synchronization in the
> > following lines from the above discussion:
> >
>
> Thanks for summarizing the points discussed. I would like to confirm
> whether the patch replicates new sequences that are created
> implicitly/explicitly for a publication defined as ALL SEQUENCES.

Currently, FOR ALL SEQUENCES publication both explicitly created
sequences and implicitly created sequences will be synchronized during
the creation of subscriptions (using CREATE SUBSCRIPTION) and
refreshing publication sequences(using ALTER SUBSCRIPTION ... REFRESH
PUBLICATION SEQUENCES).
Therefore, the explicitly created sequence seq1:
CREATE SEQUENCE seq1;
and the implicitly created sequence seq_test2_c2_seq for seq_test2 table:
CREATE TABLE seq_test2 (c1 int, c2 SERIAL);
will both be synchronized.

Regards,
Vignesh



Re: Logical Replication of sequences

От
Shlok Kyal
Дата:
On Thu, 20 Jun 2024 at 18:24, vignesh C <vignesh21@gmail.com> wrote:
>
> On Wed, 19 Jun 2024 at 20:33, vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Tue, 18 Jun 2024 at 16:10, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > Agreed and I am not sure which is better because there is a value in
> > > keeping the state name the same for both sequences and tables. We
> > > probably need more comments in code and doc updates to make the
> > > behavior clear. We can start with the sequence state as 'init' for
> > > 'needs-to-be-sycned' and 'ready' for 'synced' and can change if others
> > > feel so during the review.
> >
> > Here is a patch which does the sequence synchronization in the
> > following lines from the above discussion:
> > This commit introduces sequence synchronization during 1) creation of
> > subscription for initial sync of sequences 2) refresh publication to
> > synchronize the sequences for the newly created sequences 3) refresh
> > publication sequences for synchronizing all the sequences.
> > 1) During subscription creation with CREATE SUBSCRIPTION (no syntax change):
> >    - The subscriber retrieves sequences associated with publications.
> >    - Sequences  are added in the 'init' state to the pg_subscription_rel table.
> >    - Sequence synchronization worker will be started if there are any
> > sequences to be synchronized
> >    - A new sequence synchronization worker handles synchronization in
> > batches of 100 sequences:
> >      a) Retrieves sequence values using pg_sequence_state from the publisher.
> >      b) Sets sequence values accordingly.
> >      c) Updates sequence state to 'READY' in pg_susbcripion_rel
> >      d) Commits batches of 100 synchronized sequences.
> > 2) Refreshing sequences with ALTER SUBSCRIPTION ... REFRESH
> > PUBLICATION (no syntax change):
> >    - Stale sequences are removed from pg_subscription_rel.
> >    - Newly added sequences in the publisher are added in 'init' state
> > to pg_subscription_rel.
> >    - Sequence synchronization will be done by sequence sync worker as
> > listed in subscription creation process.
> >    - Sequence synchronization occurs for newly added sequences only.
> > 3) Introduce new command ALTER SUBSCRIPTION ... REFRESH PUBLICATION
> > SEQUENCES  for refreshing all sequences:
> >    - Removes stale sequences and adds newly added sequences from the
> > publisher to pg_subscription_rel.
> >    - Resets all sequences in pg_subscription_rel to 'init' state.
> >    - Initiates sequence synchronization for all sequences by sequence
> > sync worker as listed in subscription creation process.
>
> Here is an updated patch with a few fixes to remove an unused
> function, changed a few references of table to sequence and added one
> CHECK_FOR_INTERRUPTS in the sequence sync worker loop.

Hi Vignesh,

I have reviewed the patches and I have following comments:

===== tablesync.c ======
1. process_syncing_sequences_for_apply can crash with:
2024-06-21 15:25:17.208 IST [3681269] LOG:  logical replication apply
worker for subscription "test1" has started
2024-06-21 15:28:10.127 IST [3682329] LOG:  logical replication
sequences synchronization worker for subscription "test1" has started
2024-06-21 15:28:10.146 IST [3682329] LOG:  logical replication
synchronization for subscription "test1", sequence "s1" has finished
2024-06-21 15:28:10.149 IST [3682329] LOG:  logical replication
synchronization for subscription "test1", sequence "s2" has finished
2024-06-21 15:28:10.149 IST [3682329] LOG:  logical replication
sequences synchronization worker for subscription "test1" has finished
2024-06-21 15:29:53.535 IST [3682767] LOG:  logical replication
sequences synchronization worker for subscription "test1" has started
TRAP: failed Assert("nestLevel > 0 && (nestLevel <= GUCNestLevel ||
(nestLevel == GUCNestLevel + 1 && !isCommit))"), File: "guc.c", Line:
2273, PID: 3682767
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (ExceptionalCondition+0xbb)[0x5b2a61861c99]
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (AtEOXact_GUC+0x7b)[0x5b2a618bddfa]
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (RestoreUserContext+0xc7)[0x5b2a618a6937]
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (+0x1ff7dfa)[0x5b2a61115dfa]
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (+0x1ff7eb4)[0x5b2a61115eb4]
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (SequencesyncWorkerMain+0x33)[0x5b2a61115fe7]
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (BackgroundWorkerMain+0x4ad)[0x5b2a61029cae]
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (postmaster_child_launch+0x236)[0x5b2a6102fb36]
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (+0x1f1d12a)[0x5b2a6103b12a]
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (+0x1f1df0f)[0x5b2a6103bf0f]
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (+0x1f1bf71)[0x5b2a61039f71]
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (+0x1f16f73)[0x5b2a61034f73]
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (PostmasterMain+0x18fb)[0x5b2a61034445]
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (+0x1ab1ab8)[0x5b2a60bcfab8]
/lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7b76bc629d90]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7b76bc629e40]
postgres: logical replication sequencesync worker for subscription
16389 sync 0 (_start+0x25)[0x5b2a601491a5]

Analysis:
Suppose there are two sequences (s1, s2) on publisher.
SO, during initial sync.
in loop,
+ foreach(lc, table_states_not_ready)

table_states_not_ready -> it contains both s1 and s2.
So, for s1 a sequence sync will be started. It will sync all sequences
and the sequence sync worker will exit.
Now, for s2 again a sequence sync will start. It will give the above error.

Is this loop required? Instead we can just use a bool like
'is_any_sequence_not_ready'. Thoughts?

===== sequencesync.c =====
2. function name should be 'LogicalRepSyncSequences' instead of
'LogicalRepSyncSeqeunces'

3. In function 'LogicalRepSyncSeqeunces'
    sequencerel = table_open(seqinfo->relid, RowExclusiveLock);\
    There is a extra '\' symbol

4. In function LogicalRepSyncSeqeunces:
+ ereport(LOG,
+ errmsg("logical replication synchronization for subscription \"%s\",
sequence \"%s\" has finished",
+   get_subscription_name(subid, false), RelationGetRelationName(sequencerel)));
+ table_close(sequencerel, NoLock);
+
+ currseq++;
+
+ if (currseq % MAX_SEQUENCES_SYNC_PER_BATCH == 0 || currseq ==
list_length(sequences))
+ CommitTransactionCommand();


The above message gets logged even if the changes are not committed.
Suppose the sequence worker exits before commit due to some reason.
Thought the log will show that sequence is synced, the sequence will
be in 'init' state. I think this is not desirable.
Maybe we should log the synced sequences at commit time? Thoughts?

===== General ====
5. We can use other macros like 'foreach_ptr' instead of 'foreach'

Thanks and Regards,
Shlok Kyal



Re: Logical Replication of sequences

От
vignesh C
Дата:
On Tue, 25 Jun 2024 at 17:53, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> On Thu, 20 Jun 2024 at 18:24, vignesh C <vignesh21@gmail.com> wrote:
> >
> > On Wed, 19 Jun 2024 at 20:33, vignesh C <vignesh21@gmail.com> wrote:
> > >
> > > On Tue, 18 Jun 2024 at 16:10, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > >
> > > > Agreed and I am not sure which is better because there is a value in
> > > > keeping the state name the same for both sequences and tables. We
> > > > probably need more comments in code and doc updates to make the
> > > > behavior clear. We can start with the sequence state as 'init' for
> > > > 'needs-to-be-sycned' and 'ready' for 'synced' and can change if others
> > > > feel so during the review.
> > >
> > > Here is a patch which does the sequence synchronization in the
> > > following lines from the above discussion:
> > > This commit introduces sequence synchronization during 1) creation of
> > > subscription for initial sync of sequences 2) refresh publication to
> > > synchronize the sequences for the newly created sequences 3) refresh
> > > publication sequences for synchronizing all the sequences.
> > > 1) During subscription creation with CREATE SUBSCRIPTION (no syntax change):
> > >    - The subscriber retrieves sequences associated with publications.
> > >    - Sequences  are added in the 'init' state to the pg_subscription_rel table.
> > >    - Sequence synchronization worker will be started if there are any
> > > sequences to be synchronized
> > >    - A new sequence synchronization worker handles synchronization in
> > > batches of 100 sequences:
> > >      a) Retrieves sequence values using pg_sequence_state from the publisher.
> > >      b) Sets sequence values accordingly.
> > >      c) Updates sequence state to 'READY' in pg_susbcripion_rel
> > >      d) Commits batches of 100 synchronized sequences.
> > > 2) Refreshing sequences with ALTER SUBSCRIPTION ... REFRESH
> > > PUBLICATION (no syntax change):
> > >    - Stale sequences are removed from pg_subscription_rel.
> > >    - Newly added sequences in the publisher are added in 'init' state
> > > to pg_subscription_rel.
> > >    - Sequence synchronization will be done by sequence sync worker as
> > > listed in subscription creation process.
> > >    - Sequence synchronization occurs for newly added sequences only.
> > > 3) Introduce new command ALTER SUBSCRIPTION ... REFRESH PUBLICATION
> > > SEQUENCES  for refreshing all sequences:
> > >    - Removes stale sequences and adds newly added sequences from the
> > > publisher to pg_subscription_rel.
> > >    - Resets all sequences in pg_subscription_rel to 'init' state.
> > >    - Initiates sequence synchronization for all sequences by sequence
> > > sync worker as listed in subscription creation process.
> >
> > Here is an updated patch with a few fixes to remove an unused
> > function, changed a few references of table to sequence and added one
> > CHECK_FOR_INTERRUPTS in the sequence sync worker loop.
>
> Hi Vignesh,
>
> I have reviewed the patches and I have following comments:
>
> ===== tablesync.c ======
> 1. process_syncing_sequences_for_apply can crash with:
> 2024-06-21 15:25:17.208 IST [3681269] LOG:  logical replication apply
> worker for subscription "test1" has started
> 2024-06-21 15:28:10.127 IST [3682329] LOG:  logical replication
> sequences synchronization worker for subscription "test1" has started
> 2024-06-21 15:28:10.146 IST [3682329] LOG:  logical replication
> synchronization for subscription "test1", sequence "s1" has finished
> 2024-06-21 15:28:10.149 IST [3682329] LOG:  logical replication
> synchronization for subscription "test1", sequence "s2" has finished
> 2024-06-21 15:28:10.149 IST [3682329] LOG:  logical replication
> sequences synchronization worker for subscription "test1" has finished
> 2024-06-21 15:29:53.535 IST [3682767] LOG:  logical replication
> sequences synchronization worker for subscription "test1" has started
> TRAP: failed Assert("nestLevel > 0 && (nestLevel <= GUCNestLevel ||
> (nestLevel == GUCNestLevel + 1 && !isCommit))"), File: "guc.c", Line:
> 2273, PID: 3682767
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (ExceptionalCondition+0xbb)[0x5b2a61861c99]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (AtEOXact_GUC+0x7b)[0x5b2a618bddfa]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (RestoreUserContext+0xc7)[0x5b2a618a6937]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (+0x1ff7dfa)[0x5b2a61115dfa]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (+0x1ff7eb4)[0x5b2a61115eb4]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (SequencesyncWorkerMain+0x33)[0x5b2a61115fe7]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (BackgroundWorkerMain+0x4ad)[0x5b2a61029cae]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (postmaster_child_launch+0x236)[0x5b2a6102fb36]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (+0x1f1d12a)[0x5b2a6103b12a]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (+0x1f1df0f)[0x5b2a6103bf0f]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (+0x1f1bf71)[0x5b2a61039f71]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (+0x1f16f73)[0x5b2a61034f73]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (PostmasterMain+0x18fb)[0x5b2a61034445]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (+0x1ab1ab8)[0x5b2a60bcfab8]
> /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7b76bc629d90]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7b76bc629e40]
> postgres: logical replication sequencesync worker for subscription
> 16389 sync 0 (_start+0x25)[0x5b2a601491a5]
>
> Analysis:
> Suppose there are two sequences (s1, s2) on publisher.
> SO, during initial sync.
> in loop,
> + foreach(lc, table_states_not_ready)
>
> table_states_not_ready -> it contains both s1 and s2.
> So, for s1 a sequence sync will be started. It will sync all sequences
> and the sequence sync worker will exit.
> Now, for s2 again a sequence sync will start. It will give the above error.
>
> Is this loop required? Instead we can just use a bool like
> 'is_any_sequence_not_ready'. Thoughts?
>
> ===== sequencesync.c =====
> 2. function name should be 'LogicalRepSyncSequences' instead of
> 'LogicalRepSyncSeqeunces'
>
> 3. In function 'LogicalRepSyncSeqeunces'
>     sequencerel = table_open(seqinfo->relid, RowExclusiveLock);\
>     There is a extra '\' symbol
>
> 4. In function LogicalRepSyncSeqeunces:
> + ereport(LOG,
> + errmsg("logical replication synchronization for subscription \"%s\",
> sequence \"%s\" has finished",
> +   get_subscription_name(subid, false), RelationGetRelationName(sequencerel)));
> + table_close(sequencerel, NoLock);
> +
> + currseq++;
> +
> + if (currseq % MAX_SEQUENCES_SYNC_PER_BATCH == 0 || currseq ==
> list_length(sequences))
> + CommitTransactionCommand();
>
>
> The above message gets logged even if the changes are not committed.
> Suppose the sequence worker exits before commit due to some reason.
> Thought the log will show that sequence is synced, the sequence will
> be in 'init' state. I think this is not desirable.
> Maybe we should log the synced sequences at commit time? Thoughts?
>
> ===== General ====
> 5. We can use other macros like 'foreach_ptr' instead of 'foreach'

Thanks for the comments, the attached patch has the fixes for the same.

Regards,
Vignesh

Вложения

Re: Logical Replication of sequences

От
Peter Smith
Дата:
Here are my initial review comments for the first patch v20240625-0001.

======
General

1. Missing docs?

Section 9.17. "Sequence Manipulation Functions" [1] describes some
functions. Shouldn't your new function be documented here also?

~~~

2. Missing tests?

Shouldn't there be some test code that at least executes your new
pg_sequence_state function to verify that sane values are returned?

======
Commit Message

3.
This patch introduces new functionalities to PostgreSQL:
- pg_sequence_state allows retrieval of sequence values using LSN.
- SetSequence enables updating sequences with user-specified values.

~

3a.
I didn't understand why this says "using LSN" because IIUC 'lsn' is an
output parameter of that function. Don't you mean "... retrieval of
sequence values including LSN"?

~

3b.
Does "user-specified" make sense? Is this going to be exposed to a
user? How about just "specified"?

======
src/backend/commands/sequence.c

4. SetSequence:

+void
+SetSequence(Oid seq_relid, int64 value)

Would 'new_last_value' be a better parameter name here?

~~~

5.
This new function logic looks pretty similar to the do_setval()
function. Can you explain (maybe in the function comment) some info
about how and why it differs from that other function?

~~~

6.
I saw that RelationNeedsWAL() is called 2 times. It may make no sense,
but is it possible to assign that to a variable 1st time so you don't
need to call it 2nd time within the critical section?

~~~

NITPICK - remove junk (') char in comment

NITPICK - missing periods (.) in multi-sentence comment

~~~

7.
-read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
+read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple,
+    XLogRecPtr *lsn)

7a.
The existing parameters were described in the function comment. So,
the new 'lsn' parameter should be described here also.

~

7b.
Maybe the new parameter name should be 'lsn_res' or 'lsn_out' or
similar to emphasise that this is a returned value.

~~

NITPICK - tweaked comment. YMMV.

~~~

8. pg_sequence_state:

Should you give descriptions of the output parameters in the function
header comment? Otherwise, where are they described so called knows
what they mean?

~~~

NITPICK - /relid/seq_relid/

NITPICK - declare the variables in the same order as the output parameters

NITPICK - An alternative to the memset for nulls is just to use static
initialisation
"bool nulls[4] = {false, false, false, false};"

======
+extern void SetSequence(Oid seq_relid, int64 value);

9.
Would 'SetSequenceLastValue' be a better name for what this function is doing?

======

99.
See also my attached diff which is a top-up patch implementing those
nitpicks mentioned above. Please apply any of these that you agree
with.

======
[1] https://www.postgresql.org/docs/devel/functions-sequence.html

Kind Regards,
Peter Smith.
Fujitsu Australia

Вложения