Discussion: Logical Replication of sequences
In the past, we have discussed various approaches to replicate sequences by decoding the sequence changes from WAL. However, we faced several challenges in achieving that, some of which are due to the non-transactional nature of sequences. The major ones were:

(a) correctness of the decoding part; some of the problems were discussed at [1][2][3]
(b) handling of sequences, especially adding certain sequences automatically (e.g. sequences backing SERIAL/BIGSERIAL columns), for built-in logical replication is not considered in the proposed work [1]
(c) there were some performance concerns in not-so-frequent scenarios [4] (see performance issues); we can probably deal with this by making sequences optional for built-in logical replication

It may be possible to deal with these and any other issues with more work, but as the use case for this feature is primarily major version upgrades, it is not clear that we want to make such a big change to the code, or whether there are better alternatives to achieve the same.

This time at pgconf.dev (https://2024.pgconf.dev/), we discussed alternative approaches for this work, which I would like to summarize. The various methods we discussed are as follows:

1. Provide a tool to copy all the sequences from publisher to subscriber. The major drawback is that users need to perform this as an additional step during the upgrade, which would be inconvenient and probably not as useful as some built-in mechanism.

2. Provide a command, say Alter Subscription ... Replicate Sequences (or something like that), which users can perform before shutdown of the publisher node during the upgrade. This will allow copying all the sequences from the publisher node to the subscriber node directly. Similar to the previous approach, this could also be inconvenient for users.

3. Replicate published sequences via walsender at the time of shutdown or incrementally while decoding the checkpoint record.
The two ways to achieve this are: (a) WAL-log a special NOOP record just before shutting down the checkpointer, then allow the walsender to read the sequence data and send it to the subscriber while decoding the new NOOP record; (b) similar to the previous idea, but instead of WAL-logging a new record, directly invoke a decoding callback after the walsender receives a request to shut down, which will allow pgoutput to read and send the required sequences. This approach has the drawback that we are adding more work at the time of shutdown, but note that we already wait for all the WAL records to be decoded and sent before shutting down the walsender during shutdown of the node.

Any other ideas?

I have added the members I remember were part of the discussion to this email. Please feel free to correct me if I have misunderstood or missed any point we talked about.

Thoughts?

[1] - https://www.postgresql.org/message-id/e4145f77-6f37-40e0-a770-aba359c50b93%40enterprisedb.com
[2] - https://www.postgresql.org/message-id/CAA4eK1Lxt%2B5a9fA-B7FRzfd1vns%3DEwZTF5z9_xO9Ms4wsqD88Q%40mail.gmail.com
[3] - https://www.postgresql.org/message-id/CAA4eK1KR4%3DyALKP0pOdVkqUwoUqD_v7oU3HzY-w0R_EBvgHL2w%40mail.gmail.com
[4] - https://www.postgresql.org/message-id/12822961-b7de-9d59-dd27-2e3dc3980c7e%40enterprisedb.com

--
With Regards,
Amit Kapila.
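As a note on option 1, the copy step such a tool would perform can be sketched with existing SQL alone. This is only an illustration of the manual work involved, not a proposed implementation; the sequence name public.s1 and the value 42 are placeholders:

```sql
-- On the publisher: capture the current state of the sequences
-- (pg_sequences is an existing catalog view; last_value is NULL for
-- sequences that have not been used yet).
SELECT schemaname, sequencename, last_value
FROM pg_sequences
WHERE last_value IS NOT NULL;

-- On the subscriber: replay each captured row, e.g. for one sequence
-- whose captured last_value was 42:
SELECT pg_catalog.setval('public.s1', 42, true);
```

The inconvenience is visible even in the sketch: the user must run this per database, at exactly the right moment, against both nodes.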
Hi,

On Tue, Jun 4, 2024 at 4:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> 3. Replicate published sequences via walsender at the time of shutdown or incrementally while decoding checkpoint record. The two ways to achieve this are: (a) WAL log a special NOOP record just before shutting down checkpointer. Then allow the WALsender to read the sequence data and send it to the subscriber while decoding the new NOOP record. (b) Similar to the previous idea but instead of WAL logging a new record directly invokes a decoding callback after walsender receives a request to shutdown which will allow pgoutput to read and send required sequences. This approach has a drawback that we are adding more work at the time of shutdown but note that we already waits for all the WAL records to be decoded and sent before shutting down the walsender during shutdown of the node.

Thanks. IIUC, both of the above approaches decode the sequences only during shutdown. I'm wondering, why not periodically decode and replicate the published sequences so that the decoding at shutdown will not take that long? I can imagine a case where there are tens of thousands of sequences in a production server, and surely decoding and sending them just during the shutdown can take a lot of time, prolonging the overall downtime.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Jun 4, 2024 at 4:53 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
>
> Thanks. IIUC, both of the above approaches decode the sequences during only shutdown. I'm wondering, why not periodically decode and replicate the published sequences so that the decoding at the shutdown will not take that longer?

Even if we decode it periodically (say each time we decode the checkpoint record), we still need to send the entire set of sequences at shutdown. This is because the sequences may have changed since the last time we sent them.

> I can imagine a case where there are tens of thousands of sequences in a production server, and surely decoding and sending them just during the shutdown can take a lot of time hampering the overall server uptime.

It is possible, but we will send only the sequences that belong to publications for which the walsender is supposed to send the required data. Now, we can also imagine providing option 2 (Alter Subscription ... Replicate Sequences) so that users can replicate sequences before shutdown and then disable the subscriptions so that there won't be a corresponding walsender.

--
With Regards,
Amit Kapila.
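Spelled out, the per-subscription procedure suggested above would look roughly like this (REPLICATE SEQUENCES is only the proposed syntax and does not exist; ALTER SUBSCRIPTION ... DISABLE is an existing command, and sub1 is a placeholder name):

```sql
-- Hypothetical pre-upgrade steps, run on the subscriber for each
-- subscription pointing at the publisher about to be upgraded:
ALTER SUBSCRIPTION sub1 REPLICATE SEQUENCES;  -- proposed command (sketch)
ALTER SUBSCRIPTION sub1 DISABLE;              -- existing command
```

Disabling the subscription afterwards ensures no walsender remains for it, so the shutdown path does not need to resend the sequences.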
On Tue, Jun 4, 2024 at 4:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
3. Replicate published sequences via walsender at the time of shutdown
or incrementally while decoding checkpoint record. The two ways to
achieve this are: (a) WAL log a special NOOP record just before
shutting down checkpointer. Then allow the WALsender to read the
sequence data and send it to the subscriber while decoding the new
NOOP record. (b) Similar to the previous idea but instead of WAL
logging a new record directly invokes a decoding callback after
walsender receives a request to shutdown which will allow pgoutput to
read and send required sequences. This approach has a drawback that we
are adding more work at the time of shutdown but note that we already
waits for all the WAL records to be decoded and sent before shutting
down the walsender during shutdown of the node.
Any other ideas?
In case of a primary crash, the sequences won't get replicated. That is true even with the previous approach if the walsender is shut down because of a crash, but it is more serious with this approach. How about periodically sending this information?
Best Wishes,
Ashutosh Bapat
On 6/4/24 06:57, Amit Kapila wrote:

> 1. Provide a tool to copy all the sequences from publisher to subscriber. The major drawback is that users need to perform this as an additional step during the upgrade which would be inconvenient and probably not as useful as some built-in mechanism.

Agreed, this requires additional steps. Not a preferred approach in my opinion. When a large set of sequences is present, it will add additional downtime to the upgrade process.

> 2. Provide a command say Alter Subscription ... Replicate Sequences (or something like that) which users can perform before shutdown of the publisher node during upgrade. This will allow copying all the sequences from the publisher node to the subscriber node directly. Similar to previous approach, this could also be inconvenient for users.

This is similar to option 1 except that it is a SQL command now. Still not a preferred approach in my opinion. When a large set of sequences is present, it will add additional downtime to the upgrade process.

> 3. Replicate published sequences via walsender at the time of shutdown or incrementally while decoding checkpoint record. The two ways to achieve this are: (a) WAL log a special NOOP record just before shutting down checkpointer. Then allow the WALsender to read the sequence data and send it to the subscriber while decoding the new NOOP record. (b) Similar to the previous idea but instead of WAL logging a new record directly invokes a decoding callback after walsender receives a request to shutdown which will allow pgoutput to read and send required sequences. This approach has a drawback that we are adding more work at the time of shutdown but note that we already waits for all the WAL records to be decoded and sent before shutting down the walsender during shutdown of the node.
At the time of shutdown: a) most logical upgrades don't necessarily call for a shutdown; b) it will still add to the total downtime with a large set of sequences. The incremental option is better as it will not require a shutdown.

I do see a scenario where the sequence of events can lead to lost sequence state and duplicate sequence values, if the subscriber starts consuming sequences while the publisher is also consuming them. In such cases, the subscriber shall not be allowed to consume sequences.

--
Kind Regards,
Yogesh Sharma
Open Source Enthusiast and Advocate
PostgreSQL Contributors Team @ RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Jun 4, 2024 at 7:40 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
>
> In case of primary crash the sequence won't get replicated. That is true even with the previous approach in case walsender is shut down because of a crash, but it is more serious with this approach.

Right, but if we just want to support a major version upgrade scenario then this should be fine because upgrades require a clean shutdown.

> How about periodically sending this information?

Now, if we want to support some sort of failover then probably this will help. Do you have that use case in mind? If we want to send periodically then we can do it when decoding the checkpoint record (XLOG_CHECKPOINT_ONLINE) or some other periodic WAL record like running_xacts (XLOG_RUNNING_XACTS).

--
With Regards,
Amit Kapila.
On Tue, Jun 4, 2024 at 8:56 PM Yogesh Sharma <yogesh.sharma@catprosystems.com> wrote:
>
> This is similar to option 1 except that it is a SQL command now.

Right, but I would still prefer a command as it provides clear steps for the upgrade. Users need to: (a) Replicate Sequences for a particular subscription; (b) disable that subscription; (c) perform (a) and (b) for all the subscriptions corresponding to the publisher we want to shut down for upgrade.

I agree there are some manual steps involved here, but it is advisable for users to ensure that they have received the required data on the subscriber before the upgrade of the publisher node; otherwise, they may not be able to continue replication after the upgrade. For example, see the "Prepare for publisher upgrades" step in the pg_upgrade docs [1].

> At the time of shutdown a) most logical upgrades don't necessarily call for shutdown

Won't the major version upgrade expect that the node is down? Refer to the step "Stop both servers" in [1].

> b) it will still add to total downtime with large set of sequences. Incremental option is better as it will not require a shutdown.
>
> I do see a scenario where sequence of events can lead to loss of sequence and generate duplicate sequence values, if subscriber starts consuming sequences while publisher is also consuming them. In such cases, subscriber shall not be allowed sequence consumption.

It would be fine to not allow subscribers to consume sequences that are being logically replicated, but what about the cases where we haven't sent the latest values of sequences before the shutdown of the publisher? In such a case, the publisher would have already consumed some values that wouldn't have been sent to the subscriber, and now, when the publisher is down, even if we re-allow the sequence values to be consumed from the subscriber, it can lead to duplicate values.

[1] - https://www.postgresql.org/docs/devel/pgupgrade.html

--
With Regards,
Amit Kapila.
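The hazard described above can be made concrete with a toy simulation (plain Python, not PostgreSQL internals; Seq is just a stand-in for a sequence relation):

```python
class Seq:
    """Toy stand-in for a sequence: hands out monotonically increasing ints."""
    def __init__(self, last_value=0):
        self.last_value = last_value

    def nextval(self):
        self.last_value += 1
        return self.last_value

pub = Seq()
[pub.nextval() for _ in range(5)]                    # publisher uses 1..5

# Last sync: the sequence state is copied to the subscriber here.
sub = Seq(last_value=pub.last_value)

# The publisher keeps consuming values, then goes down before any
# further sync happens.
used_after_sync = [pub.nextval() for _ in range(5)]  # publisher uses 6..10

# If the subscriber is now allowed to consume the sequence, it hands
# out exactly the values the publisher already used:
reissued = [sub.nextval() for _ in range(5)]
assert reissued == used_after_sync                   # duplicate values
```

This is why simply blocking sequence consumption on the subscriber during replication is not enough: the window between the last sync and the publisher's shutdown still produces values the subscriber knows nothing about.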
On 04.06.24 12:57, Amit Kapila wrote:

> 2. Provide a command say Alter Subscription ... Replicate Sequences (or something like that) which users can perform before shutdown of the publisher node during upgrade. This will allow copying all the sequences from the publisher node to the subscriber node directly. Similar to previous approach, this could also be inconvenient for users.

I would start with this. In any case, you're going to need to write code to collect all the sequence values, send them over some protocol, and apply them on the subscriber. The easiest way to start is to trigger that manually. Then later you can add other ways to trigger it, either by timer or around shutdown, or whatever other ideas there might be.
On Wed, Jun 5, 2024 at 9:13 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> Right, but I would still prefer a command as it provides clear steps for the upgrade. Users need to perform (a) Replicate Sequences for a particular subscription (b) Disable that subscription (c) Perform (a) and (b) for all the subscriptions corresponding to the publisher we want to shut down for upgrade.

Another advantage of this approach over just a plain tool to copy all sequences before the upgrade is that here we can have the facility to copy just the required sequences. I mean the set of sequences that the user has specified as part of the publication.

--
With Regards,
Amit Kapila.
Hi,

On Tue, Jun 4, 2024 at 5:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> Even if we decode it periodically (say each time we decode the checkpoint record) then also we need to send the entire set of sequences at shutdown. This is because the sequences may have changed from the last time we sent them.

Agree. How about decoding and sending only the sequences that have changed since the last time they were sent? I know it requires a bit of tracking and more work, but all I'm looking for is to reduce the amount of work that walsenders need to do during the shutdown.

Having said that, I like the idea of letting the user sync the sequences via an ALTER SUBSCRIPTION command and not weaving the logic into the shutdown checkpoint path. As Peter Eisentraut said here https://www.postgresql.org/message-id/42e5cb35-4aeb-4f58-8091-90619c7c3ecc%40eisentraut.org, this can be a good starting point to get going.

> It is possible but we will send only the sequences that belong to publications for which walsender is supposed to send the required data.

Right, but what if the publication tables have tens of thousands of sequences?

> Now, we can also imagine providing option 2 (Alter Subscription ... Replicate Sequences) so that users can replicate sequences before shutdown and then disable the subscriptions so that there won't be a corresponding walsender.

As stated above, I like this idea to start with.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Jun 5, 2024 at 12:51 PM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> I would start with this. In any case, you're going to need to write code to collect all the sequence values, send them over some protocol, apply them on the subscriber. The easiest way to start is to trigger that manually. Then later you can add other ways to trigger it, either by timer or around shutdown, or whatever other ideas there might be.

Agreed. To achieve this, we can allow sequences to be copied during the initial CREATE SUBSCRIPTION command, similar to what we do for tables. And then later, by a new/existing command, we re-copy the already existing sequences on the subscriber.

The options for the new command could be:
Alter Subscription ... Refresh Sequences
Alter Subscription ... Replicate Sequences

In the second option, we need to introduce a new keyword Replicate. Can you think of any better option?

In addition to the above, the command Alter Subscription .. Refresh Publication will fetch any missing sequences, similar to what it does for tables.

Thoughts?

--
With Regards,
Amit Kapila.
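For comparison, the candidate spellings above, side by side. All of these are proposals under discussion and none of them exist; sub1, pub1, and the connection string are placeholders:

```sql
-- Initial copy at subscription creation, analogous to table sync:
CREATE SUBSCRIPTION sub1
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION pub1;                          -- would also copy sequences

-- Later re-copy of already-known sequences, option A:
ALTER SUBSCRIPTION sub1 REFRESH SEQUENCES;

-- Option B, which needs a new Replicate keyword in the grammar:
ALTER SUBSCRIPTION sub1 REPLICATE SEQUENCES;

-- Existing command, extended to also fetch newly published sequences:
ALTER SUBSCRIPTION sub1 REFRESH PUBLICATION;
```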
On Wed, Jun 5, 2024 at 8:45 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Tue, Jun 4, 2024 at 7:40 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > On Tue, Jun 4, 2024 at 4:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >>
> >>
> >> 3. Replicate published sequences via walsender at the time of shutdown
> >> or incrementally while decoding checkpoint record. The two ways to
> >> achieve this are: (a) WAL log a special NOOP record just before
> >> shutting down checkpointer. Then allow the WALsender to read the
> >> sequence data and send it to the subscriber while decoding the new
> >> NOOP record. (b) Similar to the previous idea but instead of WAL
> >> logging a new record directly invokes a decoding callback after
> >> walsender receives a request to shutdown which will allow pgoutput to
> >> read and send required sequences. This approach has a drawback that we
> >> are adding more work at the time of shutdown but note that we already
> >> waits for all the WAL records to be decoded and sent before shutting
> >> down the walsender during shutdown of the node.
> >>
> >> Any other ideas?
> >>
> >
> > In case of primary crash the sequence won't get replicated. That is true even with the previous approach in case walsender is shut down because of a crash, but it is more serious with this approach.
> >
> Right, but if we just want to support a major version upgrade scenario
> then this should be fine because upgrades require a clean shutdown.
>
> > How about periodically sending this information?
> >
> Now, if we want to support some sort of failover then probably this
> will help. Do you have that use case in mind?
Regular failover was a goal for supporting logical replication of sequences. That might be more common than the major upgrade scenario.
> If we want to send
> periodically then we can do it when decoding checkpoint
> (XLOG_CHECKPOINT_ONLINE) or some other periodic WAL record like
> running_xacts (XLOG_RUNNING_XACTS).
Yeah. I am thinking along those lines.
It must be noted, however, that none of those options makes sure that the replicated sequences' states are consistent with the state of the replicated objects which use those sequences. E.g. table t1 uses sequence s1. As of the last sequence replication, at time T1, let's say t1 had consumed values up to vl1 from s1. But later, by time T2, it consumed values up to vl2, which were not replicated, while the changes to t1 by T2 were replicated. If failover happens at that point, INSERTs on t1 would fail because of duplicate keys (values between vl1 and vl2). A previous attempt to support logical sequence replication solved this problem by replicating a future state of the sequence (current value +/- log count). Similarly, if the sequence was ALTERed between T1 and T2, the state of the sequence on the replica would be inconsistent with the state of t1. Failing over at this stage might leave t1 in an inconsistent state.
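The "future state" idea from the earlier attempt can be sketched as follows (a toy model; SEQ_LOG_VALS here mirrors the constant PostgreSQL uses to pre-log a batch of sequence values per WAL record, and the concrete numbers are purely illustrative):

```python
SEQ_LOG_VALS = 32  # values pre-logged per sequence WAL record (illustrative)

def replicated_state(current_value):
    # Replicate a *future* state of the sequence: even if the primary
    # hands out up to SEQ_LOG_VALS more values without emitting another
    # WAL record, the replica will never re-issue any of them.
    return current_value + SEQ_LOG_VALS

current = 100                                   # last value seen in WAL
replica_start = replicated_state(current)       # replica resumes from 132

# Worst case: the primary consumed every pre-logged value before failover.
consumed_after_log = list(range(current + 1, current + SEQ_LOG_VALS + 1))
assert all(v <= replica_start for v in consumed_after_log)  # no duplicates
```

The cost is a gap of up to SEQ_LOG_VALS unused values after failover, which is acceptable since sequences never guarantee gaplessness anyway.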
Best Wishes,
Ashutosh Bapat
On Wed, Jun 5, 2024 at 6:01 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
>
>> Now, if we want to support some sort of failover then probably this will help. Do you have that use case in mind?
>
> Regular failover was a goal for supporting logical replication of sequences. That might be more common than the major upgrade scenario.

We can't support regular failovers to subscribers unless we can replicate/copy slots, because the existing nodes connected to the current publisher/primary would expect that. It should be primarily useful for major version upgrades at this stage.

>> If we want to send periodically then we can do it when decoding checkpoint (XLOG_CHECKPOINT_ONLINE) or some other periodic WAL record like running_xacts (XLOG_RUNNING_XACTS).
>
> Yeah. I am thinking along those lines.
>
> It must be noted, however, that none of those options makes sure that the replicated sequences' states are consistent with the state of the replicated objects which use those sequences.

Right. As others are advocating, it seems better to support it manually via a command first, and then later we can extend it to run at shutdown or at some regular intervals. If we do that, then we should be able to support major version upgrades and planned switchover.

--
With Regards,
Amit Kapila.
On Wed, Jun 5, 2024 at 12:43 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > At the time of shutdown a) most logical upgrades don't necessarily call for shutdown
>
> Won't the major version upgrade expect that the node is down? Refer to step "Stop both servers" in [1].

I think the idea is that the publisher is the old version and the subscriber is the new version, and changes generated on the publisher are replicated to the subscriber via logical replication. And at some point, we change the application (or a router) settings so that no more transactions come to the publisher, do the last upgrade preparation work (e.g. copying the latest sequence values if required), and then change the application so that new transactions come to the subscriber. I remember a blog post about Knock doing a similar process to upgrade their clusters with minimal downtime [1].

Regards,

[1] https://knock.app/blog/zero-downtime-postgres-upgrades

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Wed, Jun 5, 2024 at 3:17 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
>
> Agree. How about decoding and sending only the sequences that are changed from the last time when they were sent? I know it requires a bit of tracking and more work, but all I'm looking for is to reduce the amount of work that walsenders need to do during the shutdown.

I see your point, but going towards tracking the changed sequences sounds like moving towards what we do for incremental backups, unless we can invent some other smart way.

> Having said that, I like the idea of letting the user sync the sequences via ALTER SUBSCRIPTION command and not weave the logic into the shutdown checkpoint path. As Peter Eisentraut said here https://www.postgresql.org/message-id/42e5cb35-4aeb-4f58-8091-90619c7c3ecc%40eisentraut.org, this can be a good starting point to get going.

Agreed.

> Right, but what if all the publication tables can have tens of thousands of sequences.

In such cases, we have no option but to send all the sequences.

> As stated above, I like this idea to start with.

+1.

--
With Regards,
Amit Kapila.
On Wed, Jun 5, 2024 at 9:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> Agreed. To achieve this, we can allow sequences to be copied during the initial CREATE SUBSCRIPTION command similar to what we do for tables. And then later by new/existing command, we re-copy the already existing sequences on the subscriber.

+1

> The options for the new command could be:
> Alter Subscription ... Refresh Sequences
> Alter Subscription ... Replicate Sequences
>
> In the second option, we need to introduce a new keyword Replicate. Can you think of any better option?

Another idea is to do that using options. For example:

For initial sequence synchronization:
CREATE SUBSCRIPTION ... WITH (copy_sequence = true);

For re-copying (or updating) sequences:
ALTER SUBSCRIPTION ... REFRESH PUBLICATION WITH (copy_sequence = true);

> In addition to the above, the command Alter Subscription .. Refresh Publication will fetch any missing sequences similar to what it does for tables.

On the subscriber side, do we need to track which sequences are created via CREATE/ALTER SUBSCRIPTION?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Thu, Jun 6, 2024 at 9:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Wed, Jun 5, 2024 at 12:43 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Tue, Jun 4, 2024 at 8:56 PM Yogesh Sharma > > <yogesh.sharma@catprosystems.com> wrote: > > > > > > On 6/4/24 06:57, Amit Kapila wrote: > > > > > > > 2. Provide a command say Alter Subscription ... Replicate Sequences > > > > (or something like that) which users can perform before shutdown of > > > > the publisher node during upgrade. This will allow copying all the > > > > sequences from the publisher node to the subscriber node directly. > > > > Similar to previous approach, this could also be inconvenient for > > > > users. > > > > > > This is similar to option 1 except that it is a SQL command now. > > > > > > > Right, but I would still prefer a command as it provides clear steps > > for the upgrade. Users need to perform (a) Replicate Sequences for a > > particular subscription (b) Disable that subscription (c) Perform (a) > > and (b) for all the subscriptions corresponding to the publisher we > > want to shut down for upgrade. > > > > I agree there are some manual steps involved here but it is advisable > > for users to ensure that they have received the required data on the > > subscriber before the upgrade of the publisher node, otherwise, they > > may not be able to continue replication after the upgrade. For > > example, see the "Prepare for publisher upgrades" step in pg_upgrade > > docs [1]. > > > > > > > > > 3. Replicate published sequences via walsender at the time of shutdown > > > > or incrementally while decoding checkpoint record. The two ways to > > > > achieve this are: (a) WAL log a special NOOP record just before > > > > shutting down checkpointer. Then allow the WALsender to read the > > > > sequence data and send it to the subscriber while decoding the new > > > > NOOP record. 
(b) Similar to the previous idea but instead of WAL > > > > logging a new record directly invokes a decoding callback after > > > > walsender receives a request to shutdown which will allow pgoutput to > > > > read and send required sequences. This approach has a drawback that we > > > > are adding more work at the time of shutdown but note that we already > > > > wait for all the WAL records to be decoded and sent before shutting > > > > down the walsender during shutdown of the node. > > > > > > At the time of shutdown a) most logical upgrades don't necessarily call > > > for shutdown > > > > > > > Won't the major version upgrade expect that the node is down? Refer to > > step "Stop both servers" in [1]. > > I think the idea is that the publisher is the old version and the > subscriber is the new version, and changes generated on the publisher > are replicated to the subscriber via logical replication. And at some > point, we change the application (or a router) settings so that no > more transactions come to the publisher, do the last upgrade > preparation work (e.g. copying the latest sequence values if > required), and then change the application so that new transactions > come to the subscriber. > Okay, thanks for sharing the exact steps. If one has to follow that path then sending incrementally (at checkpoint WAL or other times) won't work because we want to ensure that the sequences are up-to-date before the application starts using the new database. To do that in a bullet-proof way, one has to copy/replicate sequences while requests to the new database are paused (Reference from the blog you shared: For the first second after flipping the flag, our application artificially paused any new database requests for one second.). Currently, they are using some guesswork to replicate sequences, which requires manual verification and more manual work for each sequence. The new command (Alter Subscription ...
Replicate Sequence) should ease their procedure and can do things where they would require little or no verification. > I remember the blog post about Knock doing a similar process to > upgrade the clusters with minimal downtime[1]. > Thanks for sharing the blog post. -- With Regards, Amit Kapila.
On Thu, Jun 6, 2024 at 11:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Wed, Jun 5, 2024 at 9:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > To achieve this, we can allow sequences to be copied during > > the initial CREATE SUBSCRIPTION command similar to what we do for > > tables. And then later by new/existing command, we re-copy the already > > existing sequences on the subscriber. > > > > The options for the new command could be: > > Alter Subscription ... Refresh Sequences > > Alter Subscription ... Replicate Sequences > > > > In the second option, we need to introduce a new keyword Replicate. > > Can you think of any better option? > > Another idea is doing that using options. For example, > > For initial sequences synchronization: > > CREATE SUBSCRIPTION ... WITH (copy_sequence = true); > How will it interact with the existing copy_data option? So copy_data will become equivalent to copy_table_data, right? > For re-copy (or update) sequences: > > ALTER SUBSCRIPTION ... REFRESH PUBLICATION WITH (copy_sequence = true); > Similar to the previous point it can be slightly confusing w.r.t copy_data. And would copy_sequence here mean that it would copy sequence values of both pre-existing and newly added sequences, if so, that would make it behave differently than copy_data? The other possibility in this direction would be to introduce an option like replicate_all_sequences/copy_all_sequences which indicates a copy of both pre-existing and new sequences, if any. If we want to go in the direction of having an option such as copy_(all)_sequences then do you think specifying that copy_data is just for tables in the docs would be sufficient? I am afraid that it would be confusing for users. > > > > In addition to the above, the command Alter Subscription .. Refresh > > Publication will fetch any missing sequences similar to what it does > > for tables. 
> > On the subscriber side, do we need to track which sequences are > created via CREATE/ALTER SUBSCRIPTION? > I think so unless we find some other way to know at refresh publication time which all new sequences need to be part of the subscription. What should be the behavior w.r.t sequences when the user performs ALTER SUBSCRIPTION ... REFRESH PUBLICATION? I was thinking similar to tables, it should fetch any missing sequence information from the publisher. -- With Regards, Amit Kapila.
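As a concrete sketch, the two interface shapes discussed above could look as follows (hypothetical syntax; neither has been agreed on, and the subscription/publication names are made up):

```sql
-- Shape 1: a dedicated command (requires introducing a new keyword)
ALTER SUBSCRIPTION mysub REFRESH SEQUENCES;

-- Shape 2: an option on the existing commands
CREATE SUBSCRIPTION mysub
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION mypub
    WITH (copy_sequence = true);

ALTER SUBSCRIPTION mysub REFRESH PUBLICATION WITH (copy_sequence = true);
```

In shape 2, the existing copy_data option would effectively become a table-only option (copy_table_data), which is the interaction question raised above.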
On Thu, Jun 6, 2024 at 9:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jun 5, 2024 at 6:01 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Wed, Jun 5, 2024 at 8:45 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>>
>> How about periodically sending this information?
>> >
>>
>> Now, if we want to support some sort of failover then probably this
>> will help. Do you have that use case in mind?
>
>
> Regular failover was a goal for supporting logical replication of sequences. That might be more common than the major upgrade scenario.
>
We can't support regular failovers to subscribers unless we can
replicate/copy slots because the existing nodes connected to the
current publisher/primary would expect that. It should be primarily
useful for major version upgrades at this stage.
We don't want to design it in a way that requires major rework when we are able to copy slots and then support regular failovers. That's when the consistency between a sequence and the table using it would be a must. So it's better that we take that into consideration now.
Best Wishes,
Ashutosh Bapat
On Thu, Jun 6, 2024 at 3:44 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: > > On Thu, Jun 6, 2024 at 9:22 AM Amit Kapila <amit.kapila16@gmail.com> wrote: >> >> On Wed, Jun 5, 2024 at 6:01 PM Ashutosh Bapat >> <ashutosh.bapat.oss@gmail.com> wrote: >> > >> > On Wed, Jun 5, 2024 at 8:45 AM Amit Kapila <amit.kapila16@gmail.com> wrote: >> >> >> >> How about periodically sending this information? >> >> > >> >> >> >> Now, if we want to support some sort of failover then probably this >> >> will help. Do you have that use case in mind? >> > >> > >> > Regular failover was a goal for supporting logical replication of sequences. That might be more common than major upgrade scenario. >> > >> >> We can't support regular failovers to subscribers unless we can >> replicate/copy slots because the existing nodes connected to the >> current publisher/primary would expect that. It should be primarily >> useful for major version upgrades at this stage. > > > We don't want to design it in a way that requires major rework when we are able to copy slots and then support regular failover. > I don't think we can just copy slots like we do for standbys. The slots would require WAL locations to continue, so I'm not sure if we can make it work for failover for subscribers. > That's when the consistency between a sequence and the table using it would be a must. So it's better that we take that into consideration now. > With the ideas being discussed here, I could only see the use case of a major version upgrade or planned switchover to work. If we come up with any other agreeable way that is better than this then we can consider the same. -- With Regards, Amit Kapila.
On Thu, Jun 6, 2024 at 9:34 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Jun 5, 2024 at 3:17 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Tue, Jun 4, 2024 at 5:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > Even if we decode it periodically (say each time we decode the > > > checkpoint record) then also we need to send the entire set of > > > sequences at shutdown. This is because the sequences may have changed > > > from the last time we sent them. > > > > Agree. How about decoding and sending only the sequences that are > > changed from the last time when they were sent? I know it requires a > > bit of tracking and more work, but all I'm looking for is to reduce > > the amount of work that walsenders need to do during the shutdown. > > > > I see your point but going towards tracking the changed sequences > sounds like moving towards what we do for incremental backups unless > we can invent some other smart way. Yes, we would need an entirely new infrastructure to track the sequence change since the last sync. We can only determine this from WAL, and relying on it would somehow bring us back to the approach we were trying to achieve with logical decoding of sequences patch. > > Having said that, I like the idea of letting the user sync the > > sequences via ALTER SUBSCRIPTION command and not weave the logic into > > the shutdown checkpoint path. As Peter Eisentraut said here > > https://www.postgresql.org/message-id/42e5cb35-4aeb-4f58-8091-90619c7c3ecc%40eisentraut.org, > > this can be a good starting point to get going. > > > > Agreed. +1 -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Thu, Jun 6, 2024 at 6:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Jun 6, 2024 at 11:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Wed, Jun 5, 2024 at 9:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > To achieve this, we can allow sequences to be copied during > > > the initial CREATE SUBSCRIPTION command similar to what we do for > > > tables. And then later by new/existing command, we re-copy the already > > > existing sequences on the subscriber. > > > > > > The options for the new command could be: > > > Alter Subscription ... Refresh Sequences > > > Alter Subscription ... Replicate Sequences > > > > > > In the second option, we need to introduce a new keyword Replicate. > > > Can you think of any better option? > > > > Another idea is doing that using options. For example, > > > > For initial sequences synchronization: > > > > CREATE SUBSCRIPTION ... WITH (copy_sequence = true); > > > > How will it interact with the existing copy_data option? So copy_data > will become equivalent to copy_table_data, right? Right. > > > For re-copy (or update) sequences: > > > > ALTER SUBSCRIPTION ... REFRESH PUBLICATION WITH (copy_sequence = true); > > > > Similar to the previous point it can be slightly confusing w.r.t > copy_data. And would copy_sequence here mean that it would copy > sequence values of both pre-existing and newly added sequences, if so, > that would make it behave differently than copy_data? The other > possibility in this direction would be to introduce an option like > replicate_all_sequences/copy_all_sequences which indicates a copy of > both pre-existing and new sequences, if any. Copying sequence data works differently than replicating table data (initial data copy and logical replication). So I thought the copy_sequence option (or whatever better name) always does both updating pre-existing sequences and adding new sequences. 
REFRESH PUBLICATION updates the tables to be subscribed, so we also update or add sequences associated with these tables. > > If we want to go in the direction of having an option such as > copy_(all)_sequences then do you think specifying that copy_data is > just for tables in the docs would be sufficient? I am afraid that it > would be confusing for users. I see your point. But I guess it would not be very problematic as it doesn't break the current behavior and copy_(all)_sequences is primarily for upgrade use cases. > > > > > > In addition to the above, the command Alter Subscription .. Refresh > > > Publication will fetch any missing sequences similar to what it does > > > for tables. > > > > On the subscriber side, do we need to track which sequences are > > created via CREATE/ALTER SUBSCRIPTION? > > > > I think so unless we find some other way to know at refresh > publication time which all new sequences need to be part of the > subscription. What should be the behavior w.r.t sequences when the > user performs ALTER SUBSCRIPTION ... REFRESH PUBLICATION? I was > thinking similar to tables, it should fetch any missing sequence > information from the publisher. It seems to make sense to me. But I have one question: do we want to support replicating sequences that are not associated with any tables? If yes, what if we refresh two different subscriptions that subscribe to different tables on the same database? On the other hand, if no (i.e. replicating only sequences owned by tables), can we know which sequences to replicate by checking the subscribed tables? Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Fri, Jun 7, 2024 at 7:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Thu, Jun 6, 2024 at 6:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Jun 6, 2024 at 11:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > On Wed, Jun 5, 2024 at 9:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > To achieve this, we can allow sequences to be copied during > > > > the initial CREATE SUBSCRIPTION command similar to what we do for > > > > tables. And then later by new/existing command, we re-copy the already > > > > existing sequences on the subscriber. > > > > > > > > The options for the new command could be: > > > > Alter Subscription ... Refresh Sequences > > > > Alter Subscription ... Replicate Sequences > > > > > > > > In the second option, we need to introduce a new keyword Replicate. > > > > Can you think of any better option? > > > > > > Another idea is doing that using options. For example, > > > > > > For initial sequences synchronization: > > > > > > CREATE SUBSCRIPTION ... WITH (copy_sequence = true); > > > > > > > How will it interact with the existing copy_data option? So copy_data > > will become equivalent to copy_table_data, right? > > Right. > > > > > > For re-copy (or update) sequences: > > > > > > ALTER SUBSCRIPTION ... REFRESH PUBLICATION WITH (copy_sequence = true); > > > > > > > Similar to the previous point it can be slightly confusing w.r.t > > copy_data. And would copy_sequence here mean that it would copy > > sequence values of both pre-existing and newly added sequences, if so, > > that would make it behave differently than copy_data? The other > > possibility in this direction would be to introduce an option like > > replicate_all_sequences/copy_all_sequences which indicates a copy of > > both pre-existing and new sequences, if any. > > Copying sequence data works differently than replicating table data > (initial data copy and logical replication). 
So I thought the > copy_sequence option (or whatever better name) always does both > updating pre-existing sequences and adding new sequences. REFRESH > PUBLICATION updates the tables to be subscribed, so we also update or > add sequences associated to these tables. > Are you imagining the behavior for sequences associated with tables differently than the ones defined by the CREATE SEQUENCE .. command? I was thinking that users would associate sequences with publications similar to what we do for tables for both cases. For example, they need to explicitly mention the sequences they want to replicate by commands like CREATE PUBLICATION ... FOR SEQUENCE s1, s2, ...; CREATE PUBLICATION ... FOR ALL SEQUENCES, or CREATE PUBLICATION ... FOR SEQUENCES IN SCHEMA sch1; In this, variants FOR ALL SEQUENCES and SEQUENCES IN SCHEMA sch1 should copy both the explicitly defined sequences and sequences defined with the tables. Do you think we need a different variant for just copying sequences implicitly associated with tables (say, for identity columns)? > > > > > > > > > > > In addition to the above, the command Alter Subscription .. Refresh > > > > Publication will fetch any missing sequences similar to what it does > > > > for tables. > > > > > > On the subscriber side, do we need to track which sequences are > > > created via CREATE/ALTER SUBSCRIPTION? > > > > > > > I think so unless we find some other way to know at refresh > > publication time which all new sequences need to be part of the > > subscription. What should be the behavior w.r.t sequences when the > > user performs ALTER SUBSCRIPTION ... REFRESH PUBLICATION? I was > > thinking similar to tables, it should fetch any missing sequence > > information from the publisher. It seems to make sense to me. But I have one question: do we want to support replicating sequences that are not associated with any tables? > Yes, unless we see a problem with it.
> if yes, what if we refresh two different subscriptions that subscribe > to different tables on the same database? What problem do you see with it? > On the other hand, if no > (i.e. replicating only sequences owned by tables), can we know which > sequences to replicate by checking the subscribed tables? > Sorry, I didn't understand your question. Can you please try to explain in more words or use some examples? -- With Regards, Amit Kapila.
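For reference, the publication variants Amit lists, written out as statements (all hypothetical syntax; none of these exist in released PostgreSQL, and the publication names are made up):

```sql
-- Explicitly listed sequences
CREATE PUBLICATION pub_seq FOR SEQUENCE s1, s2;

-- All sequences in the database, including table-owned ones
CREATE PUBLICATION pub_all_seq FOR ALL SEQUENCES;

-- All sequences in a given schema
CREATE PUBLICATION pub_sch_seq FOR SEQUENCES IN SCHEMA sch1;
```

Per the discussion, the FOR ALL SEQUENCES and FOR SEQUENCES IN SCHEMA forms would cover both standalone sequences and those backing table columns.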
On Wed, 5 Jun 2024 at 14:11, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Jun 5, 2024 at 9:13 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Tue, Jun 4, 2024 at 8:56 PM Yogesh Sharma > > <yogesh.sharma@catprosystems.com> wrote: > > > > > > On 6/4/24 06:57, Amit Kapila wrote: > > > > > > > 2. Provide a command say Alter Subscription ... Replicate Sequences > > > > (or something like that) which users can perform before shutdown of > > > > the publisher node during upgrade. This will allow copying all the > > > > sequences from the publisher node to the subscriber node directly. > > > > Similar to previous approach, this could also be inconvenient for > > > > users. > > > > > > This is similar to option 1 except that it is a SQL command now. > > > > > > > Right, but I would still prefer a command as it provides clear steps > > for the upgrade. Users need to perform (a) Replicate Sequences for a > > particular subscription (b) Disable that subscription (c) Perform (a) > > and (b) for all the subscriptions corresponding to the publisher we > > want to shut down for upgrade. > > > > Another advantage of this approach over just a plain tool to copy all > sequences before upgrade is that here we can have the facility to copy > just the required sequences. I mean the set sequences that the user > has specified as part of the publication. Here is a WIP patch to handle synchronizing the sequence during create/alter subscription. The following changes were made for it: Subscriber modifications: Enable sequence synchronization during subscription creation or alteration using the following syntax: CREATE SUBSCRIPTION ... WITH (sequences=true); When a subscription is created with the sequence option enabled, the sequence list from the specified publications in the subscription will be retrieved from the publisher. 
Each sequence's data will then be copied from the remote publisher sequence to the local subscriber sequence by using a wal receiver connection. Since all of the sequence updating is done within a single transaction, if any errors occur during the copying process, the entire transaction will be rolled back. To refresh sequences, use the syntax: ALTER SUBSCRIPTION REFRESH SEQUENCES; During sequence refresh, the sequence list is updated by removing stale sequences and adding any missing sequences. The updated sequence list is then re-synchronized. A new catalog table, pg_subscription_seq, has been introduced for mapping subscriptions to sequences. Additionally, the sequence LSN (Log Sequence Number) is stored, facilitating determination of sequence changes occurring before or after the returned sequence state. I have taken some code changes from Tomas's patch at [1]. I'll adjust the syntax as needed based on the ongoing discussion at [2]. [1] - https://www.postgresql.org/message-id/09613730-5ee9-4cc3-82d8-f089be90aa64%40enterprisedb.com [2] - https://www.postgresql.org/message-id/CAA4eK1K2X%2BPaErtGVQPD0k_5XqxjV_Cwg37%2B-pWsmKFncwc7Wg%40mail.gmail.com Regards, Vignesh
Attachments
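The workflow described in the WIP patch, as a sketch (the subscription name is made up; the pg_subscription_seq column names are a guess from the description above, not taken from the patch):

```sql
-- Initial sync of sequences at subscription creation
CREATE SUBSCRIPTION mysub
    CONNECTION 'host=publisher dbname=postgres'
    PUBLICATION mypub
    WITH (sequences = true);

-- Later re-synchronization: stale sequences are removed from the list,
-- missing ones added, and the updated list is re-copied in one transaction
ALTER SUBSCRIPTION mysub REFRESH SEQUENCES;

-- Plausible shape of the new catalog (columns are assumptions):
--   pg_subscription_seq(srsubid oid, srrelid oid, srsublsn pg_lsn)
```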
On Fri, Jun 7, 2024 at 7:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Jun 7, 2024 at 7:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Thu, Jun 6, 2024 at 6:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Thu, Jun 6, 2024 at 11:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > On Wed, Jun 5, 2024 at 9:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > > > To achieve this, we can allow sequences to be copied during > > > > > the initial CREATE SUBSCRIPTION command similar to what we do for > > > > > tables. And then later by new/existing command, we re-copy the already > > > > > existing sequences on the subscriber. > > > > > > > > > > The options for the new command could be: > > > > > Alter Subscription ... Refresh Sequences > > > > > Alter Subscription ... Replicate Sequences > > > > > > > > > > In the second option, we need to introduce a new keyword Replicate. > > > > > Can you think of any better option? > > > > > > > > Another idea is doing that using options. For example, > > > > > > > > For initial sequences synchronization: > > > > > > > > CREATE SUBSCRIPTION ... WITH (copy_sequence = true); > > > > > > > > > > How will it interact with the existing copy_data option? So copy_data > > > will become equivalent to copy_table_data, right? > > > > Right. > > > > > > > > > For re-copy (or update) sequences: > > > > > > > > ALTER SUBSCRIPTION ... REFRESH PUBLICATION WITH (copy_sequence = true); > > > > > > > > > > Similar to the previous point it can be slightly confusing w.r.t > > > copy_data. And would copy_sequence here mean that it would copy > > > sequence values of both pre-existing and newly added sequences, if so, > > > that would make it behave differently than copy_data? 
The other > > > possibility in this direction would be to introduce an option like > > > replicate_all_sequences/copy_all_sequences which indicates a copy of > > > both pre-existing and new sequences, if any. > > > > Copying sequence data works differently than replicating table data > > (initial data copy and logical replication). So I thought the > > copy_sequence option (or whatever better name) always does both > > updating pre-existing sequences and adding new sequences. REFRESH > > PUBLICATION updates the tables to be subscribed, so we also update or > > add sequences associated to these tables. > > > > Are you imagining the behavior for sequences associated with tables > differently than the ones defined by the CREATE SEQUENCE .. command? I > was thinking that users would associate sequences with publications > similar to what we do for tables for both cases. For example, they > need to explicitly mention the sequences they want to replicate by > commands like CREATE PUBLICATION ... FOR SEQUENCE s1, s2, ...; CREATE > PUBLICATION ... FOR ALL SEQUENCES, or CREATE PUBLICATION ... FOR > SEQUENCES IN SCHEMA sch1; > > In this, variants FOR ALL SEQUENCES and SEQUENCES IN SCHEMA sch1 > should copy both the explicitly defined sequences and sequences > defined with the tables. Do you think a different variant for just > copying sequences implicitly associated with tables (say for identity > columns)? Oh, I was thinking that your proposal was to copy literally all sequences by REPLICA/REFRESH SEQUENCE command. But it seems to make sense to explicitly specify the sequences they want to replicate. It also means that they can create a publication that has only sequences. In this case, even if they create a subscription for that publication, we don't launch any apply workers for that subscription. Right? Also, given that the main use case (at least as the first step) is version upgrade, do we really need to support SEQUENCES IN SCHEMA and even FOR SEQUENCE? 
The WIP patch Vignesh recently submitted is more than 6k lines. I think we can cut the scope for the first implementation to make the review easier. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Sat, Jun 8, 2024 at 6:43 PM vignesh C <vignesh21@gmail.com> wrote:
On Wed, 5 Jun 2024 at 14:11, Amit Kapila <amit.kapila16@gmail.com> wrote:
[...]
A new catalog table, pg_subscription_seq, has been introduced for
mapping subscriptions to sequences. Additionally, the sequence LSN
(Log Sequence Number) is stored, facilitating determination of
sequence changes occurring before or after the returned sequence
state.
Can't it be done using pg_depend? It seems a bit excessive unless I'm missing
something. How do you track sequence mapping with the publication?
Regards,
Amul
On Mon, Jun 10, 2024 at 3:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Fri, Jun 7, 2024 at 7:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, Jun 7, 2024 at 7:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > On Thu, Jun 6, 2024 at 6:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > On Thu, Jun 6, 2024 at 11:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > > > On Wed, Jun 5, 2024 at 9:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > > > > > > To achieve this, we can allow sequences to be copied during > > > > > > the initial CREATE SUBSCRIPTION command similar to what we do for > > > > > > tables. And then later by new/existing command, we re-copy the already > > > > > > existing sequences on the subscriber. > > > > > > > > > > > > The options for the new command could be: > > > > > > Alter Subscription ... Refresh Sequences > > > > > > Alter Subscription ... Replicate Sequences > > > > > > > > > > > > In the second option, we need to introduce a new keyword Replicate. > > > > > > Can you think of any better option? > > > > > > > > > > Another idea is doing that using options. For example, > > > > > > > > > > For initial sequences synchronization: > > > > > > > > > > CREATE SUBSCRIPTION ... WITH (copy_sequence = true); > > > > > > > > > > > > > How will it interact with the existing copy_data option? So copy_data > > > > will become equivalent to copy_table_data, right? > > > > > > Right. > > > > > > > > > > > > For re-copy (or update) sequences: > > > > > > > > > > ALTER SUBSCRIPTION ... REFRESH PUBLICATION WITH (copy_sequence = true); > > > > > > > > > > > > > Similar to the previous point it can be slightly confusing w.r.t > > > > copy_data. And would copy_sequence here mean that it would copy > > > > sequence values of both pre-existing and newly added sequences, if so, > > > > that would make it behave differently than copy_data? 
The other > > > > possibility in this direction would be to introduce an option like > > > > replicate_all_sequences/copy_all_sequences which indicates a copy of > > > > both pre-existing and new sequences, if any. > > > > > > Copying sequence data works differently than replicating table data > > > (initial data copy and logical replication). So I thought the > > > copy_sequence option (or whatever better name) always does both > > > updating pre-existing sequences and adding new sequences. REFRESH > > > PUBLICATION updates the tables to be subscribed, so we also update or > > > add sequences associated to these tables. > > > > > > > Are you imagining the behavior for sequences associated with tables > > differently than the ones defined by the CREATE SEQUENCE .. command? I > > was thinking that users would associate sequences with publications > > similar to what we do for tables for both cases. For example, they > > need to explicitly mention the sequences they want to replicate by > > commands like CREATE PUBLICATION ... FOR SEQUENCE s1, s2, ...; CREATE > > PUBLICATION ... FOR ALL SEQUENCES, or CREATE PUBLICATION ... FOR > > SEQUENCES IN SCHEMA sch1; > > > > In this, variants FOR ALL SEQUENCES and SEQUENCES IN SCHEMA sch1 > > should copy both the explicitly defined sequences and sequences > > defined with the tables. Do you think a different variant for just > > copying sequences implicitly associated with tables (say for identity > > columns)? > > Oh, I was thinking that your proposal was to copy literally all > sequences by REPLICA/REFRESH SEQUENCE command. But it seems to make > sense to explicitly specify the sequences they want to replicate. It > also means that they can create a publication that has only sequences. > In this case, even if they create a subscription for that publication, > we don't launch any apply workers for that subscription. Right? 
> > Also, given that the main use case (at least as the first step) is > version upgrade, do we really need to support SEQUENCES IN SCHEMA and > even FOR SEQUENCE? Also, I guess that specifying individual sequences might not be convenient for users in some cases. For sequences owned by a column of a table, users might want to specify them altogether, rather than separately. For example, CREATE PUBLICATION ... FOR TABLE tab1 WITH SEQUENCES means to add the table tab1 and its sequences to the publication. For other sequences (i.e., not owned by any tables), users might want to specify them individually. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Mon, Jun 10, 2024 at 12:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Mon, Jun 10, 2024 at 3:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Fri, Jun 7, 2024 at 7:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > Are you imagining the behavior for sequences associated with tables > > > differently than the ones defined by the CREATE SEQUENCE .. command? I > > > was thinking that users would associate sequences with publications > > > similar to what we do for tables for both cases. For example, they > > > need to explicitly mention the sequences they want to replicate by > > > commands like CREATE PUBLICATION ... FOR SEQUENCE s1, s2, ...; CREATE > > > PUBLICATION ... FOR ALL SEQUENCES, or CREATE PUBLICATION ... FOR > > > SEQUENCES IN SCHEMA sch1; > > > > > > In this, variants FOR ALL SEQUENCES and SEQUENCES IN SCHEMA sch1 > > > should copy both the explicitly defined sequences and sequences > > > defined with the tables. Do you think a different variant for just > > > copying sequences implicitly associated with tables (say for identity > > > columns)? > > > > Oh, I was thinking that your proposal was to copy literally all > > sequences by REPLICA/REFRESH SEQUENCE command. > > I am trying to keep the behavior as close to tables as possible. > > But it seems to make > > sense to explicitly specify the sequences they want to replicate. It > > also means that they can create a publication that has only sequences. > > In this case, even if they create a subscription for that publication, > > we don't launch any apply workers for that subscription. Right? > > Right, good point. I had not thought about this. > > Also, given that the main use case (at least as the first step) is > > version upgrade, do we really need to support SEQUENCES IN SCHEMA and > > even FOR SEQUENCE? > At the very least, we can split the patch to move these variants to a separate patch. 
Once the main patch is finalized, we can try to evaluate the remaining separately. > Also, I guess that specifying individual sequences might not be easy > to use for users in some cases. For sequences owned by a column of a > table, users might want to specify them altogether, rather than > separately. For example, CREATE PUBLICATION ... FOR TABLE tab1 WITH > SEQUENCES means to add the table tab1 and its sequences to the > publication. For other sequences (i.e., not owned by any tables), > users might want to specify them individually. > Yeah, or we can have a syntax like CREATE PUBLICATION ... FOR TABLE tab1 INCLUDE SEQUENCES. Normally, we use the WITH clause for options (For example, CREATE SUBSCRIPTION ... WITH (streaming=...)). -- With Regards, Amit Kapila.
On Mon, 10 Jun 2024 at 12:24, Amul Sul <sulamul@gmail.com> wrote: > > > > On Sat, Jun 8, 2024 at 6:43 PM vignesh C <vignesh21@gmail.com> wrote: >> >> On Wed, 5 Jun 2024 at 14:11, Amit Kapila <amit.kapila16@gmail.com> wrote: >> [...] >> A new catalog table, pg_subscription_seq, has been introduced for >> mapping subscriptions to sequences. Additionally, the sequence LSN >> (Log Sequence Number) is stored, facilitating determination of >> sequence changes occurring before or after the returned sequence >> state. > > > Can't it be done using pg_depend? It seems a bit excessive unless I'm missing > something. We'll require the LSN because it informs the user that the sequence has been synchronized up to the LSN stored in pg_subscription_seq. Since we are not supporting incremental sync, the user will be able to identify whether they should run a sequence refresh by checking the LSN in pg_subscription_seq against the LSN of the sequence in the publisher (using the pg_sequence_state function being added). Also, this parallels our implementation for pg_subscription_seq and will aid in expanding for a) incremental synchronization and b) utilizing workers for synchronization using sequence states if necessary. How do you track sequence mapping with the publication? In the publisher we use pg_publication_rel and pg_publication_namespace for mapping the sequences with the publication. Regards, Vignesh
On Mon, 10 Jun 2024 at 14:48, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Jun 10, 2024 at 12:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Mon, Jun 10, 2024 at 3:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > On Fri, Jun 7, 2024 at 7:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > Are you imagining the behavior for sequences associated with tables > > > > differently than the ones defined by the CREATE SEQUENCE .. command? I > > > > was thinking that users would associate sequences with publications > > > > similar to what we do for tables for both cases. For example, they > > > > need to explicitly mention the sequences they want to replicate by > > > > commands like CREATE PUBLICATION ... FOR SEQUENCE s1, s2, ...; CREATE > > > > PUBLICATION ... FOR ALL SEQUENCES, or CREATE PUBLICATION ... FOR > > > > SEQUENCES IN SCHEMA sch1; > > > > > > > > In this, variants FOR ALL SEQUENCES and SEQUENCES IN SCHEMA sch1 > > > > should copy both the explicitly defined sequences and sequences > > > > defined with the tables. Do you think a different variant for just > > > > copying sequences implicitly associated with tables (say for identity > > > > columns)? > > > > > > Oh, I was thinking that your proposal was to copy literally all > > > sequences by REPLICA/REFRESH SEQUENCE command. > > > > > I am trying to keep the behavior as close to tables as possible. > > > > But it seems to make > > > sense to explicitly specify the sequences they want to replicate. It > > > also means that they can create a publication that has only sequences. > > > In this case, even if they create a subscription for that publication, > > > we don't launch any apply workers for that subscription. Right? > > > > > Right, good point. I had not thought about this. 
> > > > Also, given that the main use case (at least as the first step) is > > > version upgrade, do we really need to support SEQUENCES IN SCHEMA and > > > even FOR SEQUENCE? > > > > At the very least, we can split the patch to move these variants to a > separate patch. Once the main patch is finalized, we can try to > evaluate the remaining separately. I engaged in an offline discussion with Amit about strategizing the division of patches to facilitate the review process. We agreed on the following split: The first patch will encompass the setting and getting of sequence values (core sequence changes). The second patch will cover all changes on the publisher side related to "FOR ALL SEQUENCES." The third patch will address subscriber side changes aimed at synchronizing "FOR ALL SEQUENCES" publications. The fourth patch will focus on supporting "FOR SEQUENCE" publication. Lastly, the fifth patch will introduce support for "FOR ALL SEQUENCES IN SCHEMA" publication. I will work on this and share an updated patch for the same soon. Regards, Vignesh
On Mon, Jun 10, 2024 at 5:00 PM vignesh C <vignesh21@gmail.com> wrote:
> On Mon, 10 Jun 2024 at 12:24, Amul Sul <sulamul@gmail.com> wrote:
> >
> >
> >
> > On Sat, Jun 8, 2024 at 6:43 PM vignesh C <vignesh21@gmail.com> wrote:
> >>
> >> On Wed, 5 Jun 2024 at 14:11, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >> [...]
> >> A new catalog table, pg_subscription_seq, has been introduced for
> >> mapping subscriptions to sequences. Additionally, the sequence LSN
> >> (Log Sequence Number) is stored, facilitating determination of
> >> sequence changes occurring before or after the returned sequence
> >> state.
> >
> >
> > Can't it be done using pg_depend? It seems a bit excessive unless I'm missing
> > something.
> We'll require the lsn because the sequence LSN informs the user that
> it has been synchronized up to the LSN in pg_subscription_seq. Since
> we are not supporting incremental sync, the user will be able to
> identify if he should run refresh sequences or not by checking the lsn
> of the pg_subscription_seq and the lsn of the sequence(using
> pg_sequence_state added) in the publisher. Also, this parallels our
> implementation for pg_subscription_seq and will aid in expanding for
> a) incremental synchronization and b) utilizing workers for
> synchronization using sequence states if necessary.
> How do you track sequence mapping with the publication?
> In the publisher we use pg_publication_rel and
> pg_publication_namespace for mapping the sequences with the
> publication.
Thanks for the explanation. I'm wondering what the complexity would be, if we
wanted to do something similar on the subscriber side, i.e., tracking via
pg_subscription_rel.
Regards,
Amul
On Tue, 11 Jun 2024 at 09:41, Amul Sul <sulamul@gmail.com> wrote: > > On Mon, Jun 10, 2024 at 5:00 PM vignesh C <vignesh21@gmail.com> wrote: >> >> On Mon, 10 Jun 2024 at 12:24, Amul Sul <sulamul@gmail.com> wrote: >> > >> > >> > >> > On Sat, Jun 8, 2024 at 6:43 PM vignesh C <vignesh21@gmail.com> wrote: >> >> >> >> On Wed, 5 Jun 2024 at 14:11, Amit Kapila <amit.kapila16@gmail.com> wrote: >> >> [...] >> >> A new catalog table, pg_subscription_seq, has been introduced for >> >> mapping subscriptions to sequences. Additionally, the sequence LSN >> >> (Log Sequence Number) is stored, facilitating determination of >> >> sequence changes occurring before or after the returned sequence >> >> state. >> > >> > >> > Can't it be done using pg_depend? It seems a bit excessive unless I'm missing >> > something. >> >> We'll require the lsn because the sequence LSN informs the user that >> it has been synchronized up to the LSN in pg_subscription_seq. Since >> we are not supporting incremental sync, the user will be able to >> identify if he should run refresh sequences or not by checking the lsn >> of the pg_subscription_seq and the lsn of the sequence(using >> pg_sequence_state added) in the publisher. Also, this parallels our >> implementation for pg_subscription_seq and will aid in expanding for >> a) incremental synchronization and b) utilizing workers for >> synchronization using sequence states if necessary. >> >> How do you track sequence mapping with the publication? >> >> In the publisher we use pg_publication_rel and >> pg_publication_namespace for mapping the sequences with the >> publication. > > > Thanks for the explanation. I'm wondering what the complexity would be, if we > wanted to do something similar on the subscriber side, i.e., tracking via > pg_subscription_rel. 
Since we won't use sync workers to synchronize sequences, and sequences won't need sync states like init, sync, finishedcopy, syncdone, ready, etc., I initially considered keeping the sequences separate. However, I'm OK with using pg_subscription_rel, as it could help in adding incremental synchronization and parallelism later on. Regards, Vignesh
On Tue, Jun 11, 2024 at 12:25 PM vignesh C <vignesh21@gmail.com> wrote: > > On Mon, 10 Jun 2024 at 14:48, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Jun 10, 2024 at 12:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > On Mon, Jun 10, 2024 at 3:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > On Fri, Jun 7, 2024 at 7:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > > > > Are you imagining the behavior for sequences associated with tables > > > > > differently than the ones defined by the CREATE SEQUENCE .. command? I > > > > > was thinking that users would associate sequences with publications > > > > > similar to what we do for tables for both cases. For example, they > > > > > need to explicitly mention the sequences they want to replicate by > > > > > commands like CREATE PUBLICATION ... FOR SEQUENCE s1, s2, ...; CREATE > > > > > PUBLICATION ... FOR ALL SEQUENCES, or CREATE PUBLICATION ... FOR > > > > > SEQUENCES IN SCHEMA sch1; > > > > > > > > > > In this, variants FOR ALL SEQUENCES and SEQUENCES IN SCHEMA sch1 > > > > > should copy both the explicitly defined sequences and sequences > > > > > defined with the tables. Do you think a different variant for just > > > > > copying sequences implicitly associated with tables (say for identity > > > > > columns)? > > > > > > > > Oh, I was thinking that your proposal was to copy literally all > > > > sequences by REPLICA/REFRESH SEQUENCE command. > > > > > > > > I am trying to keep the behavior as close to tables as possible. > > > > > > But it seems to make > > > > sense to explicitly specify the sequences they want to replicate. It > > > > also means that they can create a publication that has only sequences. > > > > In this case, even if they create a subscription for that publication, > > > > we don't launch any apply workers for that subscription. Right? > > > > > > > > Right, good point. I had not thought about this. 
> > > > > > Also, given that the main use case (at least as the first step) is > > > > version upgrade, do we really need to support SEQUENCES IN SCHEMA and > > > > even FOR SEQUENCE? > > > > > > > At the very least, we can split the patch to move these variants to a > > separate patch. Once the main patch is finalized, we can try to > > evaluate the remaining separately. > > I engaged in an offline discussion with Amit about strategizing the > division of patches to facilitate the review process. We agreed on the > following split: The first patch will encompass the setting and > getting of sequence values (core sequence changes). The second patch > will cover all changes on the publisher side related to "FOR ALL > SEQUENCES." The third patch will address subscriber side changes aimed > at synchronizing "FOR ALL SEQUENCES" publications. The fourth patch > will focus on supporting "FOR SEQUENCE" publication. Lastly, the fifth > patch will introduce support for "FOR ALL SEQUENCES IN SCHEMA" > publication. > > I will work on this and share an updated patch for the same soon. +1. Sounds like a good plan. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Tue, 11 Jun 2024 at 12:38, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Tue, Jun 11, 2024 at 12:25 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Mon, 10 Jun 2024 at 14:48, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Mon, Jun 10, 2024 at 12:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > On Mon, Jun 10, 2024 at 3:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > > > On Fri, Jun 7, 2024 at 7:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > > > > > > > Are you imagining the behavior for sequences associated with tables > > > > > > differently than the ones defined by the CREATE SEQUENCE .. command? I > > > > > > was thinking that users would associate sequences with publications > > > > > > similar to what we do for tables for both cases. For example, they > > > > > > need to explicitly mention the sequences they want to replicate by > > > > > > commands like CREATE PUBLICATION ... FOR SEQUENCE s1, s2, ...; CREATE > > > > > > PUBLICATION ... FOR ALL SEQUENCES, or CREATE PUBLICATION ... FOR > > > > > > SEQUENCES IN SCHEMA sch1; > > > > > > > > > > > > In this, variants FOR ALL SEQUENCES and SEQUENCES IN SCHEMA sch1 > > > > > > should copy both the explicitly defined sequences and sequences > > > > > > defined with the tables. Do you think a different variant for just > > > > > > copying sequences implicitly associated with tables (say for identity > > > > > > columns)? > > > > > > > > > > Oh, I was thinking that your proposal was to copy literally all > > > > > sequences by REPLICA/REFRESH SEQUENCE command. > > > > > > > > > > > I am trying to keep the behavior as close to tables as possible. > > > > > > > > But it seems to make > > > > > sense to explicitly specify the sequences they want to replicate. It > > > > > also means that they can create a publication that has only sequences. 
> > > > > In this case, even if they create a subscription for that publication, > > > > > we don't launch any apply workers for that subscription. Right? > > > > > > > > > > > Right, good point. I had not thought about this. > > > > > > > > Also, given that the main use case (at least as the first step) is > > > > > version upgrade, do we really need to support SEQUENCES IN SCHEMA and > > > > > even FOR SEQUENCE? > > > > > > > > > > At the very least, we can split the patch to move these variants to a > > > separate patch. Once the main patch is finalized, we can try to > > > evaluate the remaining separately. > > > > I engaged in an offline discussion with Amit about strategizing the > > division of patches to facilitate the review process. We agreed on the > > following split: The first patch will encompass the setting and > > getting of sequence values (core sequence changes). The second patch > > will cover all changes on the publisher side related to "FOR ALL > > SEQUENCES." The third patch will address subscriber side changes aimed > > at synchronizing "FOR ALL SEQUENCES" publications. The fourth patch > > will focus on supporting "FOR SEQUENCE" publication. Lastly, the fifth > > patch will introduce support for "FOR ALL SEQUENCES IN SCHEMA" > > publication. > > > > I will work on this and share an updated patch for the same soon. > > +1. Sounds like a good plan. Amit and I engaged in an offline discussion regarding the design and contemplated that it could be like below: 1) CREATE PUBLICATION syntax enhancement: CREATE PUBLICATION ... FOR ALL SEQUENCES; The addition of a new column titled "all sequences" in the pg_publication system table will signify whether the publication is designated as all sequences publication or not. 2) CREATE SUBSCRIPTION -- no syntax change. 
Upon creation of a subscription, the following additional steps will be managed by the subscriber: i) The subscriber will retrieve the list of sequences associated with the subscription's publications. ii) For each sequence: a) Retrieve the sequence value from the publisher by invoking the pg_sequence_state function. b) Set the sequence with the value obtained from the publisher. iii) Once the subscription creation is completed, all sequence values will become visible at the subscriber's end. An alternative design approach could involve retrieving the sequence list from the publisher during subscription creation and inserting the sequences with an "init" state into the pg_subscription_rel system table. These tasks could be executed by a single sequence sync worker, which would: i) Retrieve the list of sequences in the "init" state from the pg_subscription_rel system table. ii) Initiate a transaction. iii) For each sequence: a) Obtain the sequence value from the publisher by utilizing the pg_sequence_state function. b) Update the sequence with the value obtained from the publisher. iv) Commit the transaction. The benefit of the second approach is that if there are a large number of sequences, the sequence sync can be enhanced to happen in parallel; also, if any locks are held on the sequences in the publisher, the sequence sync worker can wait to acquire the lock instead of blocking the whole CREATE SUBSCRIPTION command, which would also delay the initial copy of the tables. 3) Refreshing the sequence can be achieved through the existing command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION (no syntax change here). The subscriber identifies stale sequences, meaning sequences present in pg_subscription_rel but absent from the publication, and removes them from the pg_subscription_rel system table.
The subscriber also checks for newly added sequences in the publisher and synchronizes their values from the publisher using the steps outlined in the subscription creation process. It's worth noting that previously synchronized sequences won't be synchronized again; the sequence sync will occur solely for the newly added sequences. 4) Introducing a new command for refreshing all sequences: ALTER SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES. The subscriber will remove stale sequences and add the newly added sequences from the publisher. Following this, it will re-synchronize the sequence values for all sequences in the updated list from the publisher, following the steps outlined in the subscription creation process. 5) Incorporate the pg_sequence_state function to fetch the sequence value from the publisher, along with the page LSN. Incorporate a SetSequence function, which will acquire a new relfilenode for the sequence and set the new relfilenode with the specified value. This will facilitate rollback in case of any failures. Thoughts? Regards, Vignesh
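For concreteness, the refresh semantics in points 3) and 4) above amount to a small set computation. A minimal sketch (the function name and data shapes here are invented for illustration, not part of the proposal):

```python
def plan_sequence_refresh(publisher_seqs, subscriber_seqs, resync_all=False):
    """Plan a sequence refresh per the proposed semantics.

    resync_all=False models ALTER SUBSCRIPTION ... REFRESH PUBLICATION:
    only newly added sequences get synchronized.
    resync_all=True models ... REFRESH PUBLICATION SEQUENCES:
    every sequence in the updated list is re-synchronized.
    """
    pub = set(publisher_seqs)
    sub = set(subscriber_seqs)
    stale = sub - pub          # in pg_subscription_rel but gone from the publication
    added = pub - sub          # newly published, need an initial sync
    to_sync = pub if resync_all else added
    return stale, added, to_sync
```

The only difference between the two commands is whether `to_sync` covers the whole updated list or just the newly added sequences.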
On Tue, Jun 11, 2024 at 7:36 PM vignesh C <vignesh21@gmail.com> wrote: > > On Tue, 11 Jun 2024 at 12:38, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Tue, Jun 11, 2024 at 12:25 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > On Mon, 10 Jun 2024 at 14:48, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > On Mon, Jun 10, 2024 at 12:43 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > > > On Mon, Jun 10, 2024 at 3:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > > > > > On Fri, Jun 7, 2024 at 7:30 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > Are you imagining the behavior for sequences associated with tables > > > > > > > differently than the ones defined by the CREATE SEQUENCE .. command? I > > > > > > > was thinking that users would associate sequences with publications > > > > > > > similar to what we do for tables for both cases. For example, they > > > > > > > need to explicitly mention the sequences they want to replicate by > > > > > > > commands like CREATE PUBLICATION ... FOR SEQUENCE s1, s2, ...; CREATE > > > > > > > PUBLICATION ... FOR ALL SEQUENCES, or CREATE PUBLICATION ... FOR > > > > > > > SEQUENCES IN SCHEMA sch1; > > > > > > > > > > > > > > In this, variants FOR ALL SEQUENCES and SEQUENCES IN SCHEMA sch1 > > > > > > > should copy both the explicitly defined sequences and sequences > > > > > > > defined with the tables. Do you think a different variant for just > > > > > > > copying sequences implicitly associated with tables (say for identity > > > > > > > columns)? > > > > > > > > > > > > Oh, I was thinking that your proposal was to copy literally all > > > > > > sequences by REPLICA/REFRESH SEQUENCE command. > > > > > > > > > > > > > > I am trying to keep the behavior as close to tables as possible. > > > > > > > > > > But it seems to make > > > > > > sense to explicitly specify the sequences they want to replicate. 
It > > > > > > also means that they can create a publication that has only sequences. > > > > > > In this case, even if they create a subscription for that publication, > > > > > > we don't launch any apply workers for that subscription. Right? > > > > > > > > > > > > > > Right, good point. I had not thought about this. > > > > > > > > > > Also, given that the main use case (at least as the first step) is > > > > > > version upgrade, do we really need to support SEQUENCES IN SCHEMA and > > > > > > even FOR SEQUENCE? > > > > > > > > > > > > > At the very least, we can split the patch to move these variants to a > > > > separate patch. Once the main patch is finalized, we can try to > > > > evaluate the remaining separately. > > > > > > I engaged in an offline discussion with Amit about strategizing the > > > division of patches to facilitate the review process. We agreed on the > > > following split: The first patch will encompass the setting and > > > getting of sequence values (core sequence changes). The second patch > > > will cover all changes on the publisher side related to "FOR ALL > > > SEQUENCES." The third patch will address subscriber side changes aimed > > > at synchronizing "FOR ALL SEQUENCES" publications. The fourth patch > > > will focus on supporting "FOR SEQUENCE" publication. Lastly, the fifth > > > patch will introduce support for "FOR ALL SEQUENCES IN SCHEMA" > > > publication. > > > > > > I will work on this and share an updated patch for the same soon. > > > > +1. Sounds like a good plan. > > Amit and I engaged in an offline discussion regarding the design and > contemplated that it could be like below: > 1) CREATE PUBLICATION syntax enhancement: > CREATE PUBLICATION ... FOR ALL SEQUENCES; > The addition of a new column titled "all sequences" in the > pg_publication system table will signify whether the publication is > designated as all sequences publication or not. 
> The first approach sounds like we don't create entries for sequences in pg_subscription_rel. In this case, how do we know all sequences that we need to refresh when executing the REFRESH PUBLICATION SEQUENCES command you mentioned below? > 2) CREATE SUBSCRIPTION -- no syntax change. > Upon creation of a subscription, the following additional steps will > be managed by the subscriber: > i) The subscriber will retrieve the list of sequences associated with > the subscription's publications. > ii) For each sequence: a) Retrieve the sequence value from the > publisher by invoking the pg_sequence_state function. b) Set the > sequence with the value obtained from the publisher. iv) Once the > subscription creation is completed, all sequence values will become > visible at the subscriber's end. Sequence values are always copied from the publisher? or does it happen only when copy_data = true? > > An alternative design approach could involve retrieving the sequence > list from the publisher during subscription creation and inserting the > sequences with an "init" state into the pg_subscription_rel system > table. These tasks could be executed by a single sequence sync worker, > which would: > i) Retrieve the list of sequences in the "init" state from the > pg_subscription_rel system table. > ii) Initiate a transaction. > iii) For each sequence: a) Obtain the sequence value from the > publisher by utilizing the pg_sequence_state function. b) Update the > sequence with the value obtained from the publisher. > iv) Commit the transaction. > > The benefit with the second approach is that if there are large number > of sequences, the sequence sync can be enhanced to happen in parallel > and also if there are any locks held on the sequences in the > publisher, the sequence worker can wait to acquire the lock instead of > blocking the whole create subscription command which will delay the > initial copy of the tables too. I prefer to have separate workers to sync sequences. 
Probably we can start with a single worker and extend it to have multiple workers. BTW the sequence-sync worker will be taken from max_sync_workers_per_subscription pool? Or yet another idea I came up with is that a tablesync worker will synchronize both the table and sequences owned by the table. That is, after the tablesync worker caught up with the apply worker, the tablesync worker synchronizes sequences associated with the target table as well. One benefit would be that at the time of initial table sync being completed, the table and its sequence data are consistent. As soon as new changes come to the table, it would become inconsistent so it might not be helpful much, though. Also, sequences that are not owned by any table will still need to be synchronized by someone. > > 3) Refreshing the sequence can be achieved through the existing > command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change > here). > The subscriber identifies stale sequences, meaning sequences present > in pg_subscription_rel but absent from the publication, and removes > them from the pg_subscription_rel system table. The subscriber also > checks for newly added sequences in the publisher and synchronizes > their values from the publisher using the steps outlined in the > subscription creation process. It's worth noting that previously > synchronized sequences won't be synchronized again; the sequence sync > will occur solely for the newly added sequences. > > 4) Introducing a new command for refreshing all sequences: ALTER > SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES. > The subscriber will remove stale sequences and add newly added > sequences from the publisher. Following this, it will re-synchronize > the sequence values for all sequences in the updated list from the > publisher, following the steps outlined in the subscription creation > process. The difference between 3) and 4) is whether or not to re-synchronize the previously synchronized sequences. 
Do we really want to introduce a new command for 4)? I felt that we can invent an option say copy_all_sequence for the REFRESH PUBLICATION command to cover the 4) case. > > 5) Incorporate the pg_sequence_state function to fetch the sequence > value from the publisher, along with the page LSN. Incorporate > SetSequence function, which will procure a new relfilenode for the > sequence and set the new relfilenode with the specified value. This > will facilitate rollback in case of any failures. Does it mean that we create a new relfilenode for every update of the value? Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Tue, Jun 11, 2024 at 4:06 PM vignesh C <vignesh21@gmail.com> wrote: > > Amit and I engaged in an offline discussion regarding the design and > contemplated that it could be like below: If I understand correctly, does this require the sequences to already exist on the subscribing node before creating the subscription, or will it also copy any non-existing sequences? > 1) CREATE PUBLICATION syntax enhancement: > CREATE PUBLICATION ... FOR ALL SEQUENCES; > The addition of a new column titled "all sequences" in the > pg_publication system table will signify whether the publication is > designated as all sequences publication or not. > > 2) CREATE SUBSCRIPTION -- no syntax change. > Upon creation of a subscription, the following additional steps will > be managed by the subscriber: > i) The subscriber will retrieve the list of sequences associated with > the subscription's publications. > ii) For each sequence: a) Retrieve the sequence value from the > publisher by invoking the pg_sequence_state function. b) Set the > sequence with the value obtained from the publisher. iv) Once the > subscription creation is completed, all sequence values will become > visible at the subscriber's end. > > An alternative design approach could involve retrieving the sequence > list from the publisher during subscription creation and inserting the > sequences with an "init" state into the pg_subscription_rel system > table. These tasks could be executed by a single sequence sync worker, > which would: > i) Retrieve the list of sequences in the "init" state from the > pg_subscription_rel system table. > ii) Initiate a transaction. > iii) For each sequence: a) Obtain the sequence value from the > publisher by utilizing the pg_sequence_state function. b) Update the > sequence with the value obtained from the publisher. > iv) Commit the transaction. 
> > The benefit with the second approach is that if there are large number > of sequences, the sequence sync can be enhanced to happen in parallel > and also if there are any locks held on the sequences in the > publisher, the sequence worker can wait to acquire the lock instead of > blocking the whole create subscription command which will delay the > initial copy of the tables too. Yeah w.r.t. this point second approach seems better. > 3) Refreshing the sequence can be achieved through the existing > command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change > here). > The subscriber identifies stale sequences, meaning sequences present > in pg_subscription_rel but absent from the publication, and removes > them from the pg_subscription_rel system table. The subscriber also > checks for newly added sequences in the publisher and synchronizes > their values from the publisher using the steps outlined in the > subscription creation process. It's worth noting that previously > synchronized sequences won't be synchronized again; the sequence sync > will occur solely for the newly added sequences. > 4) Introducing a new command for refreshing all sequences: ALTER > SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES. > The subscriber will remove stale sequences and add newly added > sequences from the publisher. Following this, it will re-synchronize > the sequence values for all sequences in the updated list from the > publisher, following the steps outlined in the subscription creation > process. Okay, this answers my first question: we will remove the sequences that are removed from the publisher and add the new sequences. I don't see any problem with this, but doesn't it seem like we are effectively doing DDL replication only for sequences without having a comprehensive plan for overall DDL replication? > 5) Incorporate the pg_sequence_state function to fetch the sequence > value from the publisher, along with the page LSN. 
> Incorporate SetSequence function, which will procure a new relfilenode for the > sequence and set the new relfilenode with the specified value. This > will facilitate rollback in case of any failures. I do not understand this point: do you mean that whenever we fetch the sequence value from the publisher, we need to create a new relfilenode on the subscriber? Why wouldn't just updating the catalog tuple be sufficient? Or is this for handling the ALTER SEQUENCE case? -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
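For reference, the alternative sequence-sync worker loop quoted above can be sketched as below. This is a minimal illustration with stand-in callbacks; none of these names are real APIs:

```python
def sync_init_sequences(init_seqs, fetch_remote_value, set_local_value):
    """One pass of the proposed sequence-sync worker.

    i)   init_seqs: sequences read from pg_subscription_rel in the 'init' state
    ii)  the caller wraps this call in a single transaction, so a failure
         rolls everything back and the sequences remain in 'init'
    iii) for each sequence, fetch the publisher-side value (pg_sequence_state
         in the proposal) and apply it on the subscriber
    iv)  the caller commits after this function returns
    """
    synced = []
    for seq in init_seqs:
        value = fetch_remote_value(seq)   # a) publisher-side lookup
        set_local_value(seq, value)       # b) subscriber-side update
        synced.append(seq)
    return synced
```

Running the whole batch under one transaction is what makes the failure handling simple: nothing is marked done until the commit.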
On Wed, Jun 12, 2024 at 10:44 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Tue, Jun 11, 2024 at 7:36 PM vignesh C <vignesh21@gmail.com> wrote: > > > > 1) CREATE PUBLICATION syntax enhancement: > > CREATE PUBLICATION ... FOR ALL SEQUENCES; > > The addition of a new column titled "all sequences" in the > > pg_publication system table will signify whether the publication is > > designated as all sequences publication or not. > > > > The first approach sounds like we don't create entries for sequences > in pg_subscription_rel. In this case, how do we know all sequences > that we need to refresh when executing the REFRESH PUBLICATION > SEQUENCES command you mentioned below? > As per my understanding, we should be creating entries for sequences in pg_subscription_rel similar to tables. The difference would be that we won't need all the sync_states (i = initialize, d = data is being copied, f = finished table copy, s = synchronized, r = ready) as we don't need any synchronization with apply workers. > > 2) CREATE SUBSCRIPTION -- no syntax change. > > Upon creation of a subscription, the following additional steps will > > be managed by the subscriber: > > i) The subscriber will retrieve the list of sequences associated with > > the subscription's publications. > > ii) For each sequence: a) Retrieve the sequence value from the > > publisher by invoking the pg_sequence_state function. b) Set the > > sequence with the value obtained from the publisher. iv) Once the > > subscription creation is completed, all sequence values will become > > visible at the subscriber's end. > > Sequence values are always copied from the publisher? or does it > happen only when copy_data = true? > It is better to do it when "copy_data = true" to keep it compatible with the table's behavior. 
> > > > An alternative design approach could involve retrieving the sequence > > list from the publisher during subscription creation and inserting the > > sequences with an "init" state into the pg_subscription_rel system > > table. These tasks could be executed by a single sequence sync worker, > > which would: > > i) Retrieve the list of sequences in the "init" state from the > > pg_subscription_rel system table. > > ii) Initiate a transaction. > > iii) For each sequence: a) Obtain the sequence value from the > > publisher by utilizing the pg_sequence_state function. b) Update the > > sequence with the value obtained from the publisher. > > iv) Commit the transaction. > > > > The benefit with the second approach is that if there are large number > > of sequences, the sequence sync can be enhanced to happen in parallel > > and also if there are any locks held on the sequences in the > > publisher, the sequence worker can wait to acquire the lock instead of > > blocking the whole create subscription command which will delay the > > initial copy of the tables too. > > I prefer to have separate workers to sync sequences. > +1. > Probably we can > start with a single worker and extend it to have multiple workers. Yeah, starting with a single worker sounds good for now. Do you think we should sync all the sequences in a single transaction or have some threshold value above which a different transaction would be required or maybe a different sequence sync worker altogether? Now, having multiple sequence-sync workers requires some synchronization so that only a single worker is allocated for one sequence. The simplest thing is to use a single sequence sync worker that syncs all sequences in one transaction but with a large number of sequences, it could be inefficient. OTOH, I am not sure if it would be a problem in reality. > > BTW > the sequence-sync worker will be taken from > max_sync_workers_per_subscription pool? > I think so. 
> Or yet another idea I came up with is that a tablesync worker will > synchronize both the table and sequences owned by the table. That is, > after the tablesync worker caught up with the apply worker, the > tablesync worker synchronizes sequences associated with the target > table as well. One benefit would be that at the time of initial table > sync being completed, the table and its sequence data are consistent. > As soon as new changes come to the table, it would become inconsistent > so it might not be helpful much, though. Also, sequences that are not > owned by any table will still need to be synchronized by someone. > The other thing to consider in this idea is that we somehow need to distinguish the sequences owned by the table. > > > > 3) Refreshing the sequence can be achieved through the existing > > command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change > > here). > > The subscriber identifies stale sequences, meaning sequences present > > in pg_subscription_rel but absent from the publication, and removes > > them from the pg_subscription_rel system table. The subscriber also > > checks for newly added sequences in the publisher and synchronizes > > their values from the publisher using the steps outlined in the > > subscription creation process. It's worth noting that previously > > synchronized sequences won't be synchronized again; the sequence sync > > will occur solely for the newly added sequences. > > > > 4) Introducing a new command for refreshing all sequences: ALTER > > SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES. > > The subscriber will remove stale sequences and add newly added > > sequences from the publisher. Following this, it will re-synchronize > > the sequence values for all sequences in the updated list from the > > publisher, following the steps outlined in the subscription creation > > process. > > The difference between 3) and 4) is whether or not to re-synchronize > the previously synchronized sequences. 
Do we really want to introduce > a new command for 4)? I felt that we can invent an option say > copy_all_sequence for the REFRESH PUBLICATION command to cover the 4) > case. > Yeah, that is also an option, but it could cause confusion alongside the copy_data option. Say the user has selected copy_data = false but copy_all_sequences = true; then the first option indicates to *not* copy the data of tables and sequences while the second indicates to copy the sequence data, which sounds contradictory. The other idea is to have an option copy_existing_sequences (which indicates to copy existing sequence values), but that has somewhat the same drawback as copy_all_sequences, though to a lesser degree. > > > > 5) Incorporate the pg_sequence_state function to fetch the sequence > > value from the publisher, along with the page LSN. Incorporate > > SetSequence function, which will procure a new relfilenode for the > > sequence and set the new relfilenode with the specified value. This > > will facilitate rollback in case of any failures. > > Does it mean that we create a new relfilenode for every update of the value? > We need it for the initial sync so that if there is an error, both the sequence state in pg_subscription_rel and the sequence values can be rolled back together. However, it is unclear whether we need to create a new relfilenode while copying existing sequences (say during ALTER SUBSCRIPTION .. REFRESH PUBLICATION SEQUENCES, or whatever command we decide on). The answer probably lies in how we want to implement this command. If we want to copy all sequence values during the command itself, then it is probably okay, but if we want to hand over this task to the sequence-sync worker, then we need some state management and a new relfilenode so that on error both the state and the sequence values are rolled back. -- With Regards, Amit Kapila.
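The single-transaction-versus-threshold question raised earlier in this message can be illustrated with a toy batching policy. This is only an illustrative sketch in Python, not PostgreSQL code; the function name and the threshold value are hypothetical. The idea is that syncing sequences in fixed-size groups, each committed in its own transaction, bounds how much work a mid-sync failure forces the worker to redo:

```python
def batch_sequences(sequences, threshold):
    """Split the sequences to sync into per-transaction batches.

    threshold >= len(sequences) degenerates to the single-transaction
    approach discussed in the thread; smaller thresholds mean a failure
    only forces re-copying the current batch, at the cost of committing
    more transactions.
    """
    return [sequences[i:i + threshold]
            for i in range(0, len(sequences), threshold)]

# 250 sequences with a threshold of 100 give batches of 100, 100 and 50.
batches = batch_sequences([f"seq{n}" for n in range(250)], threshold=100)
```

With a single worker this is just a loop over batches; with multiple workers each batch could become a unit of allocation, which is where the per-sequence synchronization mentioned above would come in.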
On Wed, 12 Jun 2024 at 10:51, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Tue, Jun 11, 2024 at 4:06 PM vignesh C <vignesh21@gmail.com> wrote: > > > > Amit and I engaged in an offline discussion regarding the design and > > contemplated that it could be like below: > > If I understand correctly, does this require the sequences to already > exist on the subscribing node before creating the subscription, or > will it also copy any non-existing sequences? Sequences must exist in the subscriber; we'll synchronize only their values. Any sequences that are not present in the subscriber will trigger an error. > > 1) CREATE PUBLICATION syntax enhancement: > > CREATE PUBLICATION ... FOR ALL SEQUENCES; > > The addition of a new column titled "all sequences" in the > > pg_publication system table will signify whether the publication is > > designated as all sequences publication or not. > > > > 2) CREATE SUBSCRIPTION -- no syntax change. > > Upon creation of a subscription, the following additional steps will > > be managed by the subscriber: > > i) The subscriber will retrieve the list of sequences associated with > > the subscription's publications. > > ii) For each sequence: a) Retrieve the sequence value from the > > publisher by invoking the pg_sequence_state function. b) Set the > > sequence with the value obtained from the publisher. iv) Once the > > subscription creation is completed, all sequence values will become > > visible at the subscriber's end. > > > > An alternative design approach could involve retrieving the sequence > > list from the publisher during subscription creation and inserting the > > sequences with an "init" state into the pg_subscription_rel system > > table. These tasks could be executed by a single sequence sync worker, > > which would: > > i) Retrieve the list of sequences in the "init" state from the > > pg_subscription_rel system table. > > ii) Initiate a transaction. 
> > iii) For each sequence: a) Obtain the sequence value from the > > publisher by utilizing the pg_sequence_state function. b) Update the > > sequence with the value obtained from the publisher. > > iv) Commit the transaction. > > > > The benefit with the second approach is that if there are large number > > of sequences, the sequence sync can be enhanced to happen in parallel > > and also if there are any locks held on the sequences in the > > publisher, the sequence worker can wait to acquire the lock instead of > > blocking the whole create subscription command which will delay the > > initial copy of the tables too. > > Yeah w.r.t. this point second approach seems better. ok > > 3) Refreshing the sequence can be achieved through the existing > > command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change > > here). > > The subscriber identifies stale sequences, meaning sequences present > > in pg_subscription_rel but absent from the publication, and removes > > them from the pg_subscription_rel system table. The subscriber also > > checks for newly added sequences in the publisher and synchronizes > > their values from the publisher using the steps outlined in the > > subscription creation process. It's worth noting that previously > > synchronized sequences won't be synchronized again; the sequence sync > > will occur solely for the newly added sequences. > > > 4) Introducing a new command for refreshing all sequences: ALTER > > SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES. > > The subscriber will remove stale sequences and add newly added > > sequences from the publisher. Following this, it will re-synchronize > > the sequence values for all sequences in the updated list from the > > publisher, following the steps outlined in the subscription creation > > process. > > Okay, this answers my first question: we will remove the sequences > that are removed from the publisher and add the new sequences. 
I don't > see any problem with this, but doesn't it seem like we are effectively > doing DDL replication only for sequences without having a > comprehensive plan for overall DDL replication? What I intended to convey is that we'll remove the sequences from pg_subscription_rel. We won't provide DDL replication of sequences; instead, we expect users to create the sequences themselves. > > 5) Incorporate the pg_sequence_state function to fetch the sequence > > value from the publisher, along with the page LSN. Incorporate > > SetSequence function, which will procure a new relfilenode for the > > sequence and set the new relfilenode with the specified value. This > > will facilitate rollback in case of any failures. > > I do not understand this point, you mean whenever we are fetching the > sequence value from the publisher we need to create a new relfilenode > on the subscriber? Why not just update the catalog tuple is > sufficient? Or this is for handling the ALTER SEQUENCE case? Sequences operate differently from tables: alterations to a sequence become visible in other sessions instantly, even before the transaction commits. To keep the sequence value and the state update in pg_subscription_rel synchronized, we assign the sequence a new relfilenode. This ensures that on any error, both the sequence state in pg_subscription_rel and the sequence values can be rolled back together. Regards, Vignesh
On Wed, Jun 12, 2024 at 4:08 PM vignesh C <vignesh21@gmail.com> wrote: > > On Wed, 12 Jun 2024 at 10:51, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Tue, Jun 11, 2024 at 4:06 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > Amit and I engaged in an offline discussion regarding the design and > > > contemplated that it could be like below: > > > > If I understand correctly, does this require the sequences to already > > exist on the subscribing node before creating the subscription, or > > will it also copy any non-existing sequences? > > Sequences must exist in the subscriber; we'll synchronize only their > values. Any sequences that are not present in the subscriber will > trigger an error. Okay, that makes sense. > > > > 3) Refreshing the sequence can be achieved through the existing > > > command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change > > > here). > > > The subscriber identifies stale sequences, meaning sequences present > > > in pg_subscription_rel but absent from the publication, and removes > > > them from the pg_subscription_rel system table. The subscriber also > > > checks for newly added sequences in the publisher and synchronizes > > > their values from the publisher using the steps outlined in the > > > subscription creation process. It's worth noting that previously > > > synchronized sequences won't be synchronized again; the sequence sync > > > will occur solely for the newly added sequences. > > > > > 4) Introducing a new command for refreshing all sequences: ALTER > > > SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES. > > > The subscriber will remove stale sequences and add newly added > > > sequences from the publisher. Following this, it will re-synchronize > > > the sequence values for all sequences in the updated list from the > > > publisher, following the steps outlined in the subscription creation > > > process. 
> > > > Okay, this answers my first question: we will remove the sequences > > that are removed from the publisher and add the new sequences. I don't > > see any problem with this, but doesn't it seem like we are effectively > > doing DDL replication only for sequences without having a > > comprehensive plan for overall DDL replication? > > What I intended to convey is that we'll eliminate the sequences from > pg_subscription_rel. We won't facilitate the DDL replication of > sequences; instead, we anticipate users to create the sequences > themselves. hmm okay. > > > 5) Incorporate the pg_sequence_state function to fetch the sequence > > > value from the publisher, along with the page LSN. Incorporate > > > SetSequence function, which will procure a new relfilenode for the > > > sequence and set the new relfilenode with the specified value. This > > > will facilitate rollback in case of any failures. > > > > I do not understand this point, you mean whenever we are fetching the > > sequence value from the publisher we need to create a new relfilenode > > on the subscriber? Why not just update the catalog tuple is > > sufficient? Or this is for handling the ALTER SEQUENCE case? > > Sequences operate distinctively from tables. Alterations to sequences > reflect instantly in another session, even before committing the > transaction. To ensure the synchronization of sequence value and state > updates in pg_subscription_rel, we assign it a new relfilenode. This > strategy ensures that any potential errors allow for the rollback of > both the sequence state in pg_subscription_rel and the sequence values > simultaneously. So, you're saying that when we synchronize the sequence values on the subscriber side, we will create a new relfilenode to allow reverting to the old state of the sequence in case of an error or transaction rollback? But why would we want to do that? 
Generally, even if you call nextval() on a sequence and then roll back the transaction, the sequence value doesn't revert to the old value. So, what specific problem on the subscriber side are we trying to avoid by operating on a new relfilenode? -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Wed, 12 Jun 2024 at 17:09, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Wed, Jun 12, 2024 at 4:08 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Wed, 12 Jun 2024 at 10:51, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Tue, Jun 11, 2024 at 4:06 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > > > Amit and I engaged in an offline discussion regarding the design and > > > > contemplated that it could be like below: > > > > > > If I understand correctly, does this require the sequences to already > > > exist on the subscribing node before creating the subscription, or > > > will it also copy any non-existing sequences? > > > > Sequences must exist in the subscriber; we'll synchronize only their > > values. Any sequences that are not present in the subscriber will > > trigger an error. > > Okay, that makes sense. > > > > > > > 3) Refreshing the sequence can be achieved through the existing > > > > command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change > > > > here). > > > > The subscriber identifies stale sequences, meaning sequences present > > > > in pg_subscription_rel but absent from the publication, and removes > > > > them from the pg_subscription_rel system table. The subscriber also > > > > checks for newly added sequences in the publisher and synchronizes > > > > their values from the publisher using the steps outlined in the > > > > subscription creation process. It's worth noting that previously > > > > synchronized sequences won't be synchronized again; the sequence sync > > > > will occur solely for the newly added sequences. > > > > > > > 4) Introducing a new command for refreshing all sequences: ALTER > > > > SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES. > > > > The subscriber will remove stale sequences and add newly added > > > > sequences from the publisher. 
Following this, it will re-synchronize > > > > the sequence values for all sequences in the updated list from the > > > > publisher, following the steps outlined in the subscription creation > > > > process. > > > > > > Okay, this answers my first question: we will remove the sequences > > > that are removed from the publisher and add the new sequences. I don't > > > see any problem with this, but doesn't it seem like we are effectively > > > doing DDL replication only for sequences without having a > > > comprehensive plan for overall DDL replication? > > > > What I intended to convey is that we'll eliminate the sequences from > > pg_subscription_rel. We won't facilitate the DDL replication of > > sequences; instead, we anticipate users to create the sequences > > themselves. > > hmm okay. > > > > > 5) Incorporate the pg_sequence_state function to fetch the sequence > > > > value from the publisher, along with the page LSN. Incorporate > > > > SetSequence function, which will procure a new relfilenode for the > > > > sequence and set the new relfilenode with the specified value. This > > > > will facilitate rollback in case of any failures. > > > > > > I do not understand this point, you mean whenever we are fetching the > > > sequence value from the publisher we need to create a new relfilenode > > > on the subscriber? Why not just update the catalog tuple is > > > sufficient? Or this is for handling the ALTER SEQUENCE case? > > > > Sequences operate distinctively from tables. Alterations to sequences > > reflect instantly in another session, even before committing the > > transaction. To ensure the synchronization of sequence value and state > > updates in pg_subscription_rel, we assign it a new relfilenode. This > > strategy ensures that any potential errors allow for the rollback of > > both the sequence state in pg_subscription_rel and the sequence values > > simultaneously. 
> > So, you're saying that when we synchronize the sequence values on the > subscriber side, we will create a new relfilenode to allow reverting > to the old state of the sequence in case of an error or transaction > rollback? But why would we want to do that? Generally, even if you > call nextval() on a sequence and then roll back the transaction, the > sequence value doesn't revert to the old value. So, what specific > problem on the subscriber side are we trying to avoid by operating on > a new relfilenode? Let's consider a situation with two sequences: seq1 with a value of 100 and seq2 with a value of 200. Say seq1 is synced and its value is updated, and then the attempt to synchronize seq2 fails because the sequence does not exist or some other issue is encountered. In this scenario, we don't want to end up in a state where seq1's data has been synchronized but its state in pg_subscription_rel has not been changed to "ready". Updating the sequence data in place makes the change visible immediately. However, if we assign a new relfilenode to the sequence and write the new value into that relfilenode, then until the transaction commits, concurrent sessions keep using the old relfilenode and see only the old data. Once all sequences are synchronized and their states are updated in pg_subscription_rel, the transaction is either committed or aborted. If committed, users observe the new sequence values because each sequence now points to the new relfilenode containing the updated value. Regards, Vignesh
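The visibility argument above can be modeled with a toy catalog in which readers always resolve a sequence through its committed relfilenode. This is a Python sketch for illustration only; real sequence data lives in relfilenodes on disk and this class and its method names are entirely hypothetical:

```python
class SequenceStore:
    """Toy model of the relfilenode-swap idea: readers resolve a sequence
    through the catalog mapping name -> committed relfilenode, so a value
    written into a not-yet-committed relfilenode stays invisible."""

    def __init__(self):
        self.files = {}      # relfilenode -> sequence value
        self.catalog = {}    # sequence name -> committed relfilenode
        self.next_node = 1

    def create(self, name, value):
        node, self.next_node = self.next_node, self.next_node + 1
        self.files[node] = value
        self.catalog[name] = node

    def read(self, name):
        # Concurrent sessions always go through the committed catalog.
        return self.files[self.catalog[name]]

    def sync(self, name, new_value, fail=False):
        """Sync by writing into a fresh relfilenode; swap it into the
        catalog only on commit, discard it on error (rollback)."""
        node, self.next_node = self.next_node, self.next_node + 1
        self.files[node] = new_value
        if fail:
            del self.files[node]      # rollback: new node leaves no trace
            raise RuntimeError("sync failed; old value still visible")
        self.catalog[name] = node     # commit: swap in the new node

store = SequenceStore()
store.create("seq1", 50)
try:
    store.sync("seq1", 100, fail=True)
except RuntimeError:
    pass
after_abort = store.read("seq1")      # the failed sync changed nothing
store.sync("seq1", 100)
after_commit = store.read("seq1")     # new value visible only after commit
```

This mirrors the seq1/seq2 scenario: a failure while syncing one sequence leaves every sequence (and, in the real design, the pg_subscription_rel state) exactly as it was before the transaction started.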
On Thu, Jun 13, 2024 at 10:10 AM vignesh C <vignesh21@gmail.com> wrote: > > > So, you're saying that when we synchronize the sequence values on the > > subscriber side, we will create a new relfilenode to allow reverting > > to the old state of the sequence in case of an error or transaction > > rollback? But why would we want to do that? Generally, even if you > > call nextval() on a sequence and then roll back the transaction, the > > sequence value doesn't revert to the old value. So, what specific > > problem on the subscriber side are we trying to avoid by operating on > > a new relfilenode? > > Let's consider a situation where we have two sequences: seq1 with a > value of 100 and seq2 with a value of 200. Now, let's say seq1 is > synced and updated to 100, then we attempt to synchronize seq2, > there's a failure due to the sequence not existing or encountering > some other issue. In this scenario, we don't want to halt operations > where seq1 is synchronized, but the sequence state for sequence isn't > changed to "ready" in pg_subscription_rel. Thanks for the explanation, but I am still not getting it completely. Do you mean that unless all the sequences are synced, none of them would be marked "ready" in pg_subscription_rel? Is that necessary? I mean, why can we not sync the sequences one by one and mark them ready? Why is it necessary to have either all the sequences synced or none of them? -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Thu, 13 Jun 2024 at 10:27, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Thu, Jun 13, 2024 at 10:10 AM vignesh C <vignesh21@gmail.com> wrote: > > > > > So, you're saying that when we synchronize the sequence values on the > > > subscriber side, we will create a new relfilenode to allow reverting > > > to the old state of the sequence in case of an error or transaction > > > rollback? But why would we want to do that? Generally, even if you > > > call nextval() on a sequence and then roll back the transaction, the > > > sequence value doesn't revert to the old value. So, what specific > > > problem on the subscriber side are we trying to avoid by operating on > > > a new relfilenode? > > > > Let's consider a situation where we have two sequences: seq1 with a > > value of 100 and seq2 with a value of 200. Now, let's say seq1 is > > synced and updated to 100, then we attempt to synchronize seq2, > > there's a failure due to the sequence not existing or encountering > > some other issue. In this scenario, we don't want to halt operations > > where seq1 is synchronized, but the sequence state for sequence isn't > > changed to "ready" in pg_subscription_rel. > > Thanks for the explanation, but I am still not getting it completely, > do you mean to say unless all the sequences are not synced any of the > sequences would not be marked "ready" in pg_subscription_rel? Is that > necessary? I mean why we can not sync the sequences one by one and > mark them ready? Why it is necessary to either have all the sequences > synced or none of them? Since updating the sequence is one operation and setting pg_subscription_rel is another, I was trying to avoid a situation where the sequence is updated but its state is not reflected in pg_subscription_rel. It seems you are suggesting that it's acceptable for the sequence to be updated even if its state isn't updated in pg_subscription_rel, and in such cases, the sequence value does not need to be reverted. Regards, Vignesh
On Thu, Jun 13, 2024 at 11:53 AM vignesh C <vignesh21@gmail.com> wrote: > > On Thu, 13 Jun 2024 at 10:27, Dilip Kumar <dilipbalaut@gmail.com> wrote: > > Thanks for the explanation, but I am still not getting it completely, > > do you mean to say unless all the sequences are not synced any of the > > sequences would not be marked "ready" in pg_subscription_rel? Is that > > necessary? I mean why we can not sync the sequences one by one and > > mark them ready? Why it is necessary to either have all the sequences > > synced or none of them? > > Since updating the sequence is one operation and setting > pg_subscription_rel is another, I was trying to avoid a situation > where the sequence is updated but its state is not reflected in > pg_subscription_rel. It seems you are suggesting that it's acceptable > for the sequence to be updated even if its state isn't updated in > pg_subscription_rel, and in such cases, the sequence value does not > need to be reverted. Right, the complexity we're adding to achieve a behavior that may not be truly desirable is a concern. For instance, if we mark the status as ready but do not sync the sequences, it could lead to issues. However, if we have synced some sequences but encounter a failure without marking the status as ready, I don't consider it inconsistent in any way. But anyway, now I understand your thinking behind that so it's a good idea to leave this design behavior for a later decision. Gathering more opinions and insights during later stages will provide a clearer perspective on how to proceed with this aspect. Thanks. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Wed, Jun 12, 2024 at 6:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Jun 12, 2024 at 10:44 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Tue, Jun 11, 2024 at 7:36 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > 1) CREATE PUBLICATION syntax enhancement: > > > CREATE PUBLICATION ... FOR ALL SEQUENCES; > > > The addition of a new column titled "all sequences" in the > > > pg_publication system table will signify whether the publication is > > > designated as all sequences publication or not. > > > > > > > The first approach sounds like we don't create entries for sequences > > in pg_subscription_rel. In this case, how do we know all sequences > > that we need to refresh when executing the REFRESH PUBLICATION > > SEQUENCES command you mentioned below? > > > > As per my understanding, we should be creating entries for sequences > in pg_subscription_rel similar to tables. The difference would be that > we won't need all the sync_states (i = initialize, d = data is being > copied, f = finished table copy, s = synchronized, r = ready) as we > don't need any synchronization with apply workers. Agreed. > > > > 2) CREATE SUBSCRIPTION -- no syntax change. > > > Upon creation of a subscription, the following additional steps will > > > be managed by the subscriber: > > > i) The subscriber will retrieve the list of sequences associated with > > > the subscription's publications. > > > ii) For each sequence: a) Retrieve the sequence value from the > > > publisher by invoking the pg_sequence_state function. b) Set the > > > sequence with the value obtained from the publisher. iv) Once the > > > subscription creation is completed, all sequence values will become > > > visible at the subscriber's end. > > > > Sequence values are always copied from the publisher? or does it > > happen only when copy_data = true? > > > > It is better to do it when "copy_data = true" to keep it compatible > with the table's behavior. 
+1 > > > Probably we can > > start with a single worker and extend it to have multiple workers. > > Yeah, starting with a single worker sounds good for now. Do you think > we should sync all the sequences in a single transaction or have some > threshold value above which a different transaction would be required > or maybe a different sequence sync worker altogether? Now, having > multiple sequence-sync workers requires some synchronization so that > only a single worker is allocated for one sequence. > > The simplest thing is to use a single sequence sync worker that syncs > all sequences in one transaction but with a large number of sequences, > it could be inefficient. OTOH, I am not sure if it would be a problem > in reality. I think that we can start with using a single worker and one transaction, and measure the performance with a large number of sequences. > > Or yet another idea I came up with is that a tablesync worker will > > synchronize both the table and sequences owned by the table. That is, > > after the tablesync worker caught up with the apply worker, the > > tablesync worker synchronizes sequences associated with the target > > table as well. One benefit would be that at the time of initial table > > sync being completed, the table and its sequence data are consistent. Correction: it's not guaranteed that the sequence data and table data are consistent even in this case, since the tablesync worker could get on-disk sequence data that might have already been updated. > > As soon as new changes come to the table, it would become inconsistent > > so it might not be helpful much, though. Also, sequences that are not > > owned by any table will still need to be synchronized by someone. > > > > The other thing to consider in this idea is that we somehow need to > distinguish the sequences owned by the table. I think we can check pg_depend. The owned sequences reference the table there. 
> > > > > > > 3) Refreshing the sequence can be achieved through the existing > > > command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change > > > here). > > > The subscriber identifies stale sequences, meaning sequences present > > > in pg_subscription_rel but absent from the publication, and removes > > > them from the pg_subscription_rel system table. The subscriber also > > > checks for newly added sequences in the publisher and synchronizes > > > their values from the publisher using the steps outlined in the > > > subscription creation process. It's worth noting that previously > > > synchronized sequences won't be synchronized again; the sequence sync > > > will occur solely for the newly added sequences. > > > > > > 4) Introducing a new command for refreshing all sequences: ALTER > > > SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES. > > > The subscriber will remove stale sequences and add newly added > > > sequences from the publisher. Following this, it will re-synchronize > > > the sequence values for all sequences in the updated list from the > > > publisher, following the steps outlined in the subscription creation > > > process. > > > > The difference between 3) and 4) is whether or not to re-synchronize > > the previously synchronized sequences. Do we really want to introduce > > a new command for 4)? I felt that we can invent an option say > > copy_all_sequence for the REFRESH PUBLICATION command to cover the 4) > > case. > > > > Yeah, that is also an option but it could confuse along with copy_data > option. Say the user has selected copy_data = false but > copy_all_sequences = true then the first option indicates to *not* > copy the data of table and sequences and the second option indicates > to copy the sequences data which sounds contradictory. 
The other idea > is to have an option copy_existing_sequences (which indicates to copy > existing sequence values) but that also has somewhat the same drawback > as copy_all_sequences but to a lesser degree. Good point. And I understood that the REFRESH PUBLICATION SEQUENCES command would be helpful when users want to synchronize sequences between two nodes before upgrading. > > > > > > > 5) Incorporate the pg_sequence_state function to fetch the sequence > > > value from the publisher, along with the page LSN. Incorporate > > > SetSequence function, which will procure a new relfilenode for the > > > sequence and set the new relfilenode with the specified value. This > > > will facilitate rollback in case of any failures. > > > > Does it mean that we create a new relfilenode for every update of the value? > > > > We need it for initial sync so that if there is an error both the > sequence state in pg_subscription_rel and sequence values can be > rolled back together. Agreed. > However, it is unclear whether we need to create > a new relfilenode while copying existing sequences (say during ALTER > SUBSCRIPTION .. REFRESH PUBLICATION SEQUENCES, or whatever command we > decide)? Probably the answer lies in how we want to implement this > command. If we want to copy all sequence values during the command > itself then it is probably okay but if we want to handover this task > to the sequence-sync worker then we need some state management and a > new relfilenode so that on error both state and sequence values are > rolled back. What state transition of pg_subscription_rel entries for sequences do we need while copying sequences values? For example, we insert an entry with 'init' state at CREATE SUBSCRIPTION and then the sequence-sync worker updates to 'ready' and copies the sequence data. And at REFRESH PUBLICATION SEQUENCES, we update the state back to 'init' again so that the sequence-sync worker can process it? 
Given REFRESH PUBLICATION SEQUENCES won't be executed very frequently, it might be acceptable to transactionally update sequence values. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
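To make the state handling being discussed concrete, the transitions could be modeled as in the toy sketch below. Only the state codes mirror the pg_subscription_rel convention ('i' = init, 'r' = ready); the catalog dictionary, the worker loop, and the fetch callback are hypothetical stand-ins, not the actual PostgreSQL implementation.

```python
# Toy model of per-sequence state tracking for sequence synchronization.
# 'i' (init) and 'r' (ready) mirror pg_subscription_rel state codes;
# everything else here is illustrative.

def create_subscription(catalog, sequences):
    # CREATE SUBSCRIPTION registers every published sequence as 'i'.
    for seq in sequences:
        catalog[seq] = 'i'

def refresh_publication_sequences(catalog):
    # REFRESH PUBLICATION SEQUENCES resets all entries to 'i' so the
    # sequence-sync worker picks them up again.
    for seq in catalog:
        catalog[seq] = 'i'

def sequence_sync_worker(catalog, fetch_remote_value, local_values):
    # The worker copies the publisher's value for each 'i' entry and
    # marks it 'r'; in the design under discussion both happen in the
    # same transaction, so an error rolls back state and value together.
    for seq, state in catalog.items():
        if state == 'i':
            local_values[seq] = fetch_remote_value(seq)
            catalog[seq] = 'r'
```

The same worker path then serves both CREATE SUBSCRIPTION (entries start in 'i') and REFRESH PUBLICATION SEQUENCES (entries are reset to 'i').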
On Thu, Jun 13, 2024 at 1:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Wed, Jun 12, 2024 at 6:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > Yeah, starting with a single worker sounds good for now. Do you think > > we should sync all the sequences in a single transaction or have some > > threshold value above which a different transaction would be required > > or maybe a different sequence sync worker altogether? Now, having > > multiple sequence-sync workers requires some synchronization so that > > only a single worker is allocated for one sequence. > > > > The simplest thing is to use a single sequence sync worker that syncs > > all sequences in one transaction but with a large number of sequences, > > it could be inefficient. OTOH, I am not sure if it would be a problem > > in reality. > > I think that we can start with using a single worker and one > transaction, and measure the performance with a large number of > sequences. > Fair enough. However, this raises the question Dilip and Vignesh are discussing: do we need a new relfilenode for a sequence update even during the initial sync? As per my understanding, the idea is that similar to tables, the CREATE SUBSCRIPTION command (with copy_data = true) will create the new sequence entries in pg_subscription_rel with the state as 'i'. Then the sequence-sync worker would start a transaction and one-by-one copy the latest sequence values for each sequence (that has state as 'i' in pg_subscription_rel) and mark its state as ready 'r' and commit the transaction. Now, if there is an error during this operation, it will restart the entire operation. The idea of creating a new relfilenode is to handle the error so that if there is a rollback, the sequence state will be rolled back to 'i' and the sequence value will also be rolled back.
The other option could be that we update the sequence value without a new relfilenode and if the transaction is rolled back then only the sequence's state will be rolled back to 'i'. This would work with a minor inconsistency that sequence values will be up-to-date even when the sequence state is 'i' in pg_subscription_rel. I am not sure if that matters because anyway, they can quickly be out-of-sync with the publisher again. Now, say we don't want to maintain the state of sequences for initial sync at all; then, after an error, how will we detect if there are any pending sequences to be synced? One possibility is that we maintain a subscription-level flag 'subsequencesync' in 'pg_subscription' to indicate whether sequences need sync. This flag would indicate whether to sync all the sequences in pg_subscription_rel. This would mean that if there is an error while syncing the sequences we will resync all the sequences again. This could be acceptable considering the chances of error during sequence sync are low. The benefit is that both the REFRESH PUBLICATION SEQUENCES and CREATE SUBSCRIPTION can use the same idea and sync all sequences without needing a new relfilenode. Users can always refer to the 'subsequencesync' flag in 'pg_subscription' to see if all the sequences are synced after executing the command. > > > Or yet another idea I came up with is that a tablesync worker will > > > synchronize both the table and sequences owned by the table. That is, > > > after the tablesync worker caught up with the apply worker, the > > > tablesync worker synchronizes sequences associated with the target > > > table as well. One benefit would be that at the time of initial table > > > sync being completed, the table and its sequence data are consistent. > > Correction; it's not guaranteed that the sequence data and table data > are consistent even in this case since the tablesync worker could get > on-disk sequence data that might have already been updated.
> The benefit of this approach is not clear to me. Our aim is to sync all sequences before the upgrade, so not sure if this helps because anyway both table values and corresponding sequences can again be out-of-sync very quickly. > > > > > > > > > > 3) Refreshing the sequence can be achieved through the existing > > > > command: ALTER SUBSCRIPTION ... REFRESH PUBLICATION(no syntax change > > > > here). > > > > The subscriber identifies stale sequences, meaning sequences present > > > > in pg_subscription_rel but absent from the publication, and removes > > > > them from the pg_subscription_rel system table. The subscriber also > > > > checks for newly added sequences in the publisher and synchronizes > > > > their values from the publisher using the steps outlined in the > > > > subscription creation process. It's worth noting that previously > > > > synchronized sequences won't be synchronized again; the sequence sync > > > > will occur solely for the newly added sequences. > > > > > > > > 4) Introducing a new command for refreshing all sequences: ALTER > > > > SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES. > > > > The subscriber will remove stale sequences and add newly added > > > > sequences from the publisher. Following this, it will re-synchronize > > > > the sequence values for all sequences in the updated list from the > > > > publisher, following the steps outlined in the subscription creation > > > > process. > > > > > > The difference between 3) and 4) is whether or not to re-synchronize > > > the previously synchronized sequences. Do we really want to introduce > > > a new command for 4)? I felt that we can invent an option say > > > copy_all_sequence for the REFRESH PUBLICATION command to cover the 4) > > > case. > > > > > > > Yeah, that is also an option but it could confuse along with copy_data > > option. 
Say the user has selected copy_data = false but > > copy_all_sequences = true then the first option indicates to *not* > > copy the data of table and sequences and the second option indicates > > to copy the sequences data which sounds contradictory. The other idea > > is to have an option copy_existing_sequences (which indicates to copy > > existing sequence values) but that also has somewhat the same drawback > > as copy_all_sequences but to a lesser degree. > > Good point. And I understood that the REFRESH PUBLICATION SEQUENCES > command would be helpful when users want to synchronize sequences > between two nodes before upgrading. > Right. > > > > > > > > > > 5) Incorporate the pg_sequence_state function to fetch the sequence > > > > value from the publisher, along with the page LSN. Incorporate > > > > SetSequence function, which will procure a new relfilenode for the > > > > sequence and set the new relfilenode with the specified value. This > > > > will facilitate rollback in case of any failures. > > > > > > Does it mean that we create a new relfilenode for every update of the value? > > > > > > > We need it for initial sync so that if there is an error both the > > sequence state in pg_subscription_rel and sequence values can be > > rolled back together. > > Agreed. > > > However, it is unclear whether we need to create > > a new relfilenode while copying existing sequences (say during ALTER > > SUBSCRIPTION .. REFRESH PUBLICATION SEQUENCES, or whatever command we > > decide)? Probably the answer lies in how we want to implement this > > command. If we want to copy all sequence values during the command > > itself then it is probably okay but if we want to handover this task > > to the sequence-sync worker then we need some state management and a > > new relfilenode so that on error both state and sequence values are > > rolled back. > > What state transition of pg_subscription_rel entries for sequences do > we need while copying sequences values? 
For example, we insert an > entry with 'init' state at CREATE SUBSCRIPTION and then the > sequence-sync worker updates to 'ready' and copies the sequence data. > And at REFRESH PUBLICATION SEQUENCES, we update the state back to > 'init' again so that the sequence-sync worker can process it? Given > REFRESH PUBLICATION SEQUENCES won't be executed very frequently, it > might be acceptable to transactionally update sequence values. > Do you mean to sync the sequences during the REFRESH PUBLICATION SEQUENCES command itself? If so, there is an argument that we can do the same during CREATE SUBSCRIPTION. It would be beneficial to keep the method to sync the sequences the same for both the CREATE and REFRESH commands. I have speculated on one idea above and would be happy to see your thoughts. -- With Regards, Amit Kapila.
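The subscription-level alternative speculated on above — a single 'subsequencesync' flag in pg_subscription instead of per-sequence state — could behave roughly like the toy sketch below. Apart from the flag name, every identifier is illustrative, and error handling is deliberately simplified.

```python
# Toy sketch of the subscription-level 'subsequencesync' flag idea:
# no per-sequence state is kept, so if syncing fails part-way, the
# flag stays set and the next attempt resyncs every sequence. Only
# the flag name comes from the discussion; all else is illustrative.

def sync_all_sequences(subscription, sequences, fetch_remote_value, local_values):
    if not subscription['subsequencesync']:
        return  # nothing pending; CREATE/REFRESH would have set the flag
    try:
        for seq in sequences:
            local_values[seq] = fetch_remote_value(seq)
    except Exception:
        return  # flag stays True: the whole set is retried next time
    subscription['subsequencesync'] = False
```

The all-or-nothing retry is the trade-off noted in the mail: acceptable if sync errors are rare, since both CREATE SUBSCRIPTION and REFRESH PUBLICATION SEQUENCES can then share one mechanism without a new relfilenode.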
On Thu, Jun 13, 2024 at 7:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Jun 13, 2024 at 1:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Wed, Jun 12, 2024 at 6:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > Yeah, starting with a single worker sounds good for now. Do you think > > > we should sync all the sequences in a single transaction or have some > > > threshold value above which a different transaction would be required > > > or maybe a different sequence sync worker altogether? Now, having > > > multiple sequence-sync workers requires some synchronization so that > > > only a single worker is allocated for one sequence. > > > > > > The simplest thing is to use a single sequence sync worker that syncs > > > all sequences in one transaction but with a large number of sequences, > > > it could be inefficient. OTOH, I am not sure if it would be a problem > > > in reality. > > > > I think that we can start with using a single worker and one > > transaction, and measure the performance with a large number of > > sequences. > > > > Fair enough. However, this raises the question Dilip and Vignesh are > discussing whether we need a new relfilenode for sequence update even > during initial sync? As per my understanding, the idea is that similar > to tables, the CREATE SUBSCRIPTION command (with copy_data = true) > will create the new sequence entries in pg_subscription_rel with the > state as 'i'. Then the sequence-sync worker would start a transaction > and one-by-one copy the latest sequence values for each sequence (that > has state as 'i' in pg_subscription_rel) and mark its state as ready > 'r' and commit the transaction. Now if there is an error during this > operation it will restart the entire operation. The idea of creating a > new relfilenode is to handle the error so that if there is a rollback, > the sequence state will be rolled back to 'i' and the sequence value > will also be rolled back. 
The other option could be that we update the > sequence value without a new relfilenode and if the transaction rolled > back then only the sequence's state will be rolled back to 'i'. This > would work with a minor inconsistency that sequence values will be > up-to-date even when the sequence state is 'i' in pg_subscription_rel. > I am not sure if that matters because anyway, they can quickly be > out-of-sync with the publisher again. I think it would be fine in many cases if the sequence value is up-to-date even when the sequence state is 'i' in pg_subscription_rel. But the case we would like to avoid is where, say, the sequence-sync worker does both the synchronization of sequence values and the update of sequence states for all sequences in one transaction, and if there is an error we end up retrying the synchronization for all sequences. > > Now, say we don't want to maintain the state of sequences for initial > sync at all then after the error how will we detect if there are any > pending sequences to be synced? One possibility is that we maintain a > subscription level flag 'subsequencesync' in 'pg_subscription' to > indicate whether sequences need sync. This flag would indicate whether > to sync all the sequences in pg_susbcription_rel. This would mean that > if there is an error while syncing the sequences we will resync all > the sequences again. This could be acceptable considering the chances > of error during sequence sync are low. The benefit is that both the > REFRESH PUBLICATION SEQUENCES and CREATE SUBSCRIPTION can use the same > idea and sync all sequences without needing a new relfilenode. Users > can always refer 'subsequencesync' flag in 'pg_subscription' to see if > all the sequences are synced after executing the command. I think that REFRESH PUBLICATION {SEQUENCES} can be executed even while the sequence-sync worker is synchronizing sequences.
In this case, the worker might not see new sequences added by the concurrent REFRESH PUBLICATION {SEQUENCES} command since it's already running. The worker could end up marking the subsequencesync as completed while not synchronizing these new sequences. > > > > > Or yet another idea I came up with is that a tablesync worker will > > > > synchronize both the table and sequences owned by the table. That is, > > > > after the tablesync worker caught up with the apply worker, the > > > > tablesync worker synchronizes sequences associated with the target > > > > table as well. One benefit would be that at the time of initial table > > > > sync being completed, the table and its sequence data are consistent. > > > > Correction; it's not guaranteed that the sequence data and table data > > are consistent even in this case since the tablesync worker could get > > on-disk sequence data that might have already been updated. > > > > The benefit of this approach is not clear to me. Our aim is to sync > all sequences before the upgrade, so not sure if this helps because > anyway both table values and corresponding sequences can again be > out-of-sync very quickly. Right. Given that our aim is to sync all sequences before the upgrade, do we need to synchronize sequences even at CREATE SUBSCRIPTION time? In cases where there are a large number of sequences, synchronizing sequences in addition to tables could add overhead and make less sense, because sequences can again be out-of-sync quickly and a subscription is typically not created just before the upgrade. > > > > > > > > > > > > > > 5) Incorporate the pg_sequence_state function to fetch the sequence > > > > > value from the publisher, along with the page LSN. Incorporate > > > > > SetSequence function, which will procure a new relfilenode for the > > > > > sequence and set the new relfilenode with the specified value. This > > > > > will facilitate rollback in case of any failures.
> > > > > > > > Does it mean that we create a new relfilenode for every update of the value? > > > > > > > > > > We need it for initial sync so that if there is an error both the > > > sequence state in pg_subscription_rel and sequence values can be > > > rolled back together. > > > > Agreed. > > > > > However, it is unclear whether we need to create > > > a new relfilenode while copying existing sequences (say during ALTER > > > SUBSCRIPTION .. REFRESH PUBLICATION SEQUENCES, or whatever command we > > > decide)? Probably the answer lies in how we want to implement this > > > command. If we want to copy all sequence values during the command > > > itself then it is probably okay but if we want to handover this task > > > to the sequence-sync worker then we need some state management and a > > > new relfilenode so that on error both state and sequence values are > > > rolled back. > > > > What state transition of pg_subscription_rel entries for sequences do > > we need while copying sequences values? For example, we insert an > > entry with 'init' state at CREATE SUBSCRIPTION and then the > > sequence-sync worker updates to 'ready' and copies the sequence data. > > And at REFRESH PUBLICATION SEQUENCES, we update the state back to > > 'init' again so that the sequence-sync worker can process it? Given > > REFRESH PUBLICATION SEQUENCES won't be executed very frequently, it > > might be acceptable to transactionally update sequence values. > > > > Do you mean that sync the sequences during the REFRESH PUBLICATION > SEQUENCES command itself? If so, there is an argument that we can do > the same during CREATE SUBSCRIPTION. It would be beneficial to keep > the method to sync the sequences same for both the CREATE and REFRESH > commands. I have speculated on one idea above and would be happy to > see your thoughts. 
I meant that the REFRESH PUBLICATION SEQUENCES command updates all sequence states in pg_subscription_rel to 'init' state, and the sequence-sync worker can do the synchronization work. We use the same method for both the CREATE SUBSCRIPTION and REFRESH PUBLICATION {SEQUENCES} commands. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
On Thu, Jun 13, 2024 at 03:36:05PM +0530, Amit Kapila wrote: > Fair enough. However, this raises the question Dilip and Vignesh are > discussing whether we need a new relfilenode for sequence update even > during initial sync? As per my understanding, the idea is that similar > to tables, the CREATE SUBSCRIPTION command (with copy_data = true) > will create the new sequence entries in pg_subscription_rel with the > state as 'i'. Then the sequence-sync worker would start a transaction > and one-by-one copy the latest sequence values for each sequence (that > has state as 'i' in pg_subscription_rel) and mark its state as ready > 'r' and commit the transaction. Now if there is an error during this > operation it will restart the entire operation. Hmm. You mean to use only one transaction for all the sequences? I've heard about deployments with a lot of them. Could it be a problem to process them in batches, as well? If you maintain a state for each one of them in pg_subscription_rel, it does not strike me as an issue, while being more flexible than an all-or-nothing. > The idea of creating a > new relfilenode is to handle the error so that if there is a rollback, > the sequence state will be rolled back to 'i' and the sequence value > will also be rolled back. The other option could be that we update the > sequence value without a new relfilenode and if the transaction rolled > back then only the sequence's state will be rolled back to 'i'. This > would work with a minor inconsistency that sequence values will be > up-to-date even when the sequence state is 'i' in pg_subscription_rel. > I am not sure if that matters because anyway, they can quickly be > out-of-sync with the publisher again. Seeing a mention to relfilenodes specifically for sequences freaks me out a bit, because there's some work I have been doing in this area and sequences may not have a need for a physical relfilenode at all. 
But I guess that you refer to the fact that like tables, relfilenodes would only be created as required because anything you'd do in the apply worker path would just call some of the routines of sequence.h, right? > Now, say we don't want to maintain the state of sequences for initial > sync at all then after the error how will we detect if there are any > pending sequences to be synced? One possibility is that we maintain a > subscription level flag 'subsequencesync' in 'pg_subscription' to > indicate whether sequences need sync. This flag would indicate whether > to sync all the sequences in pg_susbcription_rel. This would mean that > if there is an error while syncing the sequences we will resync all > the sequences again. This could be acceptable considering the chances > of error during sequence sync are low. There could be multiple subscriptions to a single database that point to the same set of sequences. Is there any conflict issue to worry about here? > The benefit is that both the > REFRESH PUBLICATION SEQUENCES and CREATE SUBSCRIPTION can use the same > idea and sync all sequences without needing a new relfilenode. Users > can always refer 'subsequencesync' flag in 'pg_subscription' to see if > all the sequences are synced after executing the command. That would be cheaper, indeed. Isn't a boolean too limiting? Isn't that something you'd want to track with a LSN as "the point in WAL where all the sequences have been synced"? The approach of doing all the sync work from the subscriber, while having a command that can be kicked from the subscriber side is a good user experience. -- Michael
On Thu, Jun 13, 2024 at 6:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Thu, Jun 13, 2024 at 7:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Jun 13, 2024 at 1:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > On Wed, Jun 12, 2024 at 6:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > Yeah, starting with a single worker sounds good for now. Do you think > > > > we should sync all the sequences in a single transaction or have some > > > > threshold value above which a different transaction would be required > > > > or maybe a different sequence sync worker altogether? Now, having > > > > multiple sequence-sync workers requires some synchronization so that > > > > only a single worker is allocated for one sequence. > > > > > > > > The simplest thing is to use a single sequence sync worker that syncs > > > > all sequences in one transaction but with a large number of sequences, > > > > it could be inefficient. OTOH, I am not sure if it would be a problem > > > > in reality. > > > > > > I think that we can start with using a single worker and one > > > transaction, and measure the performance with a large number of > > > sequences. > > > > > > > Fair enough. However, this raises the question Dilip and Vignesh are > > discussing whether we need a new relfilenode for sequence update even > > during initial sync? As per my understanding, the idea is that similar > > to tables, the CREATE SUBSCRIPTION command (with copy_data = true) > > will create the new sequence entries in pg_subscription_rel with the > > state as 'i'. Then the sequence-sync worker would start a transaction > > and one-by-one copy the latest sequence values for each sequence (that > > has state as 'i' in pg_subscription_rel) and mark its state as ready > > 'r' and commit the transaction. Now if there is an error during this > > operation it will restart the entire operation. 
The idea of creating a > > new relfilenode is to handle the error so that if there is a rollback, > > the sequence state will be rolled back to 'i' and the sequence value > > will also be rolled back. The other option could be that we update the > > sequence value without a new relfilenode and if the transaction rolled > > back then only the sequence's state will be rolled back to 'i'. This > > would work with a minor inconsistency that sequence values will be > > up-to-date even when the sequence state is 'i' in pg_subscription_rel. > > I am not sure if that matters because anyway, they can quickly be > > out-of-sync with the publisher again. > > I think it would be fine in many cases even if the sequence value is > up-to-date even when the sequence state is 'i' in pg_subscription_rel. > But the case we would like to avoid is where suppose the sequence-sync > worker does both synchronizing sequence values and updating the > sequence states for all sequences in one transaction, and if there is > an error we end up retrying the synchronization for all sequences. > The one idea to avoid this is to update sequences in chunks (say 100 or some threshold number of sequences in one transaction). Then we would only redo the sync for the last and pending set of sequences. > > > > Now, say we don't want to maintain the state of sequences for initial > > sync at all then after the error how will we detect if there are any > > pending sequences to be synced? One possibility is that we maintain a > > subscription level flag 'subsequencesync' in 'pg_subscription' to > > indicate whether sequences need sync. This flag would indicate whether > > to sync all the sequences in pg_susbcription_rel. This would mean that > > if there is an error while syncing the sequences we will resync all > > the sequences again. This could be acceptable considering the chances > > of error during sequence sync are low. 
The benefit is that both the > > REFRESH PUBLICATION SEQUENCES and CREATE SUBSCRIPTION can use the same > > idea and sync all sequences without needing a new relfilenode. Users > > can always refer 'subsequencesync' flag in 'pg_subscription' to see if > > all the sequences are synced after executing the command. > > I think that REFRESH PUBLICATION {SEQUENCES} can be executed even > while the sequence-sync worker is synchronizing sequences. In this > case, the worker might not see new sequences added by the concurrent > REFRESH PUBLICATION {SEQUENCES} command since it's already running. > The worker could end up marking the subsequencesync as completed while > not synchronizing these new sequences. > This is possible, but we could avoid it by not allowing REFRESH PUBLICATION {SEQUENCES} to change the subsequencesync state while the sequence-sync worker is syncing the sequences. This could be restrictive, but there don't seem to be cases where a user would want to refresh sequences immediately after creating the subscription. > > > > > > > Or yet another idea I came up with is that a tablesync worker will > > > > > synchronize both the table and sequences owned by the table. That is, > > > > > after the tablesync worker caught up with the apply worker, the > > > > > tablesync worker synchronizes sequences associated with the target > > > > > table as well. One benefit would be that at the time of initial table > > > > > sync being completed, the table and its sequence data are consistent. > > > > > > Correction; it's not guaranteed that the sequence data and table data > > > are consistent even in this case since the tablesync worker could get > > > on-disk sequence data that might have already been updated. > > > > > > > The benefit of this approach is not clear to me. Our aim is to sync > > all sequences before the upgrade, so not sure if this helps because > > anyway both table values and corresponding sequences can again be > > out-of-sync very quickly. > > Right.
> > Given that our aim is to sync all sequences before the upgrade, do we > need to synchronize sequences even at CREATE SUBSCRIPTION time? In > cases where there are a large number of sequences, synchronizing > sequences in addition to tables could be overhead and make less sense, > because sequences can again be out-of-sync quickly and typically > CREATE SUBSCRIPTION is not created just before the upgrade. > I think for the upgrade one should create the subscription just before the upgrade. Isn't something similar done even in the upgrade steps you shared once [1]? Typically, users should get all the data from the publisher before the upgrade of the publisher by creating a subscription. Also, it would be better to keep the implementation of sequences close to tables wherever possible. Having said that, I understand your point as well and if you strongly feel that we don't need to sync sequences at the time of CREATE SUBSCRIPTION and others also don't see any problem with it then we can consider that as well. > > > > > > > Do you mean that sync the sequences during the REFRESH PUBLICATION > > SEQUENCES command itself? If so, there is an argument that we can do > > the same during CREATE SUBSCRIPTION. It would be beneficial to keep > > the method to sync the sequences same for both the CREATE and REFRESH > > commands. I have speculated on one idea above and would be happy to > > see your thoughts. > > I meant that the REFRESH PUBLICATION SEQUENCES command updates all > sequence states in pg_subscription_rel to 'init' state, and the > sequence-sync worker can do the synchronization work. We use the same > method for both the CREATE SUBSCRIPTION and REFRESH PUBLICATION > {SEQUENCES} commands. > Marking the state as 'init' when we would have already synced the sequences sounds a bit odd, but otherwise this could also work, if we accept that even synced sequences could remain in the 'init' state (on rollbacks).
[1] - https://knock.app/blog/zero-downtime-postgres-upgrades -- With Regards, Amit Kapila.
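The chunked approach floated in the mail above — commit some threshold number of sequences (say 100) per transaction so that an error only forces a redo of the pending tail — can be sketched as the toy model below. The batch size and all names are illustrative; commit_batch stands in for a per-batch transaction on the subscriber.

```python
# Toy sketch of syncing sequences in batches: each committed batch is
# removed from the pending list, so after an error only the unsynced
# tail is redone on the next attempt. Illustrative only; not the
# actual worker implementation.
BATCH_SIZE = 100

def sync_in_batches(pending, fetch_remote_value, commit_batch):
    while pending:
        batch = pending[:BATCH_SIZE]
        values = {seq: fetch_remote_value(seq) for seq in batch}
        commit_batch(values)      # one "transaction" per batch; if this
        del pending[:len(batch)]  # raises, 'pending' still holds the tail
```

With 250 pending sequences this commits in chunks of 100, 100, and 50; a failure in the second chunk leaves 150 sequences pending for the restarted worker, rather than forcing a full resync.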
On Fri, Jun 14, 2024 at 5:16 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Thu, Jun 13, 2024 at 03:36:05PM +0530, Amit Kapila wrote: > > Fair enough. However, this raises the question Dilip and Vignesh are > > discussing whether we need a new relfilenode for sequence update even > > during initial sync? As per my understanding, the idea is that similar > > to tables, the CREATE SUBSCRIPTION command (with copy_data = true) > > will create the new sequence entries in pg_subscription_rel with the > > state as 'i'. Then the sequence-sync worker would start a transaction > > and one-by-one copy the latest sequence values for each sequence (that > > has state as 'i' in pg_subscription_rel) and mark its state as ready > > 'r' and commit the transaction. Now if there is an error during this > > operation it will restart the entire operation. > > Hmm. You mean to use only one transaction for all the sequences? > I've heard about deployments with a lot of them. Could it be a > problem to process them in batches, as well? I don't think so. We can even sync one sequence per transaction but then it would be resource and time consuming without much gain. As mentioned in a previous email, we might want to sync 100 or some other threshold number of sequences per transaction. The other possibility is to make a subscription-level option for this batch size but I don't see much advantage in doing so as it won't be convenient for users to set it. I feel we should pick some threshold number that is neither too low nor too high and if we later see any problem with it, we can make it a configurable knob. > > > The idea of creating a > > new relfilenode is to handle the error so that if there is a rollback, > > the sequence state will be rolled back to 'i' and the sequence value > > will also be rolled back. 
The other option could be that we update the > > sequence value without a new relfilenode and if the transaction rolled > > back then only the sequence's state will be rolled back to 'i'. This > > would work with a minor inconsistency that sequence values will be > > up-to-date even when the sequence state is 'i' in pg_subscription_rel. > > I am not sure if that matters because anyway, they can quickly be > > out-of-sync with the publisher again. > > Seeing a mention to relfilenodes specifically for sequences freaks me > out a bit, because there's some work I have been doing in this area > and sequences may not have a need for a physical relfilenode at all. > But I guess that you refer to the fact that like tables, relfilenodes > would only be created as required because anything you'd do in the > apply worker path would just call some of the routines of sequence.h, > right? > Yes, I think so. The only thing the patch expects is a way to roll back the sequence changes if the transaction rolls back during the initial sync. But I am not sure if we need such a behavior. The discussion for the same is in progress. Let's wait for the outcome. > > Now, say we don't want to maintain the state of sequences for initial > > sync at all then after the error how will we detect if there are any > > pending sequences to be synced? One possibility is that we maintain a > > subscription level flag 'subsequencesync' in 'pg_subscription' to > > indicate whether sequences need sync. This flag would indicate whether > > to sync all the sequences in pg_susbcription_rel. This would mean that > > if there is an error while syncing the sequences we will resync all > > the sequences again. This could be acceptable considering the chances > > of error during sequence sync are low. > > There could be multiple subscriptions to a single database that point > to the same set of sequences. Is there any conflict issue to worry > about here? > I don't think so.
In the worst case, the same value would be copied twice. The same scenario in the case of tables could lead to duplicate data or unique key violation ERRORs, which is much worse. So, I expect users to be careful about the same. > > The benefit is that both the > > REFRESH PUBLICATION SEQUENCES and CREATE SUBSCRIPTION can use the same > > idea and sync all sequences without needing a new relfilenode. Users > > can always refer to the 'subsequencesync' flag in 'pg_subscription' to see if > > all the sequences are synced after executing the command. > > That would be cheaper, indeed. Isn't a boolean too limiting? > In this idea, we only need a flag to say whether the sequence sync is required or not. > Isn't that something you'd want to track with an LSN as "the point in > WAL where all the sequences have been synced"? > It won't be any better for the required purpose because after CREATE SUBSCRIPTION, if REFRESH wants to toggle the flag to indicate the sequences need sync again then using an LSN would mean we need to set it to an invalid value. > The approach of doing all the sync work from the subscriber, while > having a command that can be kicked from the subscriber side is a good > user experience. > Thank you for endorsing the idea. -- With Regards, Amit Kapila.
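The subscription-level boolean idea discussed in this email can be sketched in a few lines. This is a toy Python model, not PostgreSQL code; the dict-based subscription and the helper names (e.g. mark_all_for_resync) are assumptions for illustration. It shows the property being weighed above: the flag records only that a sync is pending, so an error before the flag is cleared simply means the next attempt re-copies every sequence.

```python
# Toy model of a subscription-level 'subsequencesync' boolean flag.
# All names here are illustrative assumptions, not actual catalog code.

def sync_all_sequences(subscription, publisher_values, local_values):
    """Copy every published sequence value; clear the flag only on success."""
    if not subscription["subsequencesync"]:
        return  # nothing pending
    for seq, value in publisher_values.items():
        local_values[seq] = value  # copy the publisher's latest value
    subscription["subsequencesync"] = False  # all synced in one go

def mark_all_for_resync(subscription):
    # REFRESH-style operation: with a plain boolean there is no
    # per-sequence granularity, so everything is marked pending again.
    subscription["subsequencesync"] = True
```

With a single boolean there is no way to resume from a partially completed sync, which is one reason the thread later moves toward per-sequence states in pg_subscription_rel.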
On Fri, Jun 14, 2024 at 4:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Jun 13, 2024 at 6:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Thu, Jun 13, 2024 at 7:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Thu, Jun 13, 2024 at 1:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > On Wed, Jun 12, 2024 at 6:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > > > > Yeah, starting with a single worker sounds good for now. Do you think > > > > > we should sync all the sequences in a single transaction or have some > > > > > threshold value above which a different transaction would be required > > > > > or maybe a different sequence sync worker altogether? Now, having > > > > > multiple sequence-sync workers requires some synchronization so that > > > > > only a single worker is allocated for one sequence. > > > > > > > > > > The simplest thing is to use a single sequence sync worker that syncs > > > > > all sequences in one transaction but with a large number of sequences, > > > > > it could be inefficient. OTOH, I am not sure if it would be a problem > > > > > in reality. > > > > > > > > I think that we can start with using a single worker and one > > > > transaction, and measure the performance with a large number of > > > > sequences. > > > > > > > > > > Fair enough. However, this raises the question Dilip and Vignesh are > > > discussing whether we need a new relfilenode for sequence update even > > > during initial sync? As per my understanding, the idea is that similar > > > to tables, the CREATE SUBSCRIPTION command (with copy_data = true) > > > will create the new sequence entries in pg_subscription_rel with the > > > state as 'i'. 
Then the sequence-sync worker would start a transaction > > > and one-by-one copy the latest sequence values for each sequence (that > > > has state as 'i' in pg_subscription_rel) and mark its state as ready > > > 'r' and commit the transaction. Now if there is an error during this > > > operation it will restart the entire operation. The idea of creating a > > > new relfilenode is to handle the error so that if there is a rollback, > > > the sequence state will be rolled back to 'i' and the sequence value > > > will also be rolled back. The other option could be that we update the > > > sequence value without a new relfilenode and if the transaction rolled > > > back then only the sequence's state will be rolled back to 'i'. This > > > would work with a minor inconsistency that sequence values will be > > > up-to-date even when the sequence state is 'i' in pg_subscription_rel. > > > I am not sure if that matters because anyway, they can quickly be > > > out-of-sync with the publisher again. > > > > I think it would be fine in many cases even if the sequence value is > > up-to-date even when the sequence state is 'i' in pg_subscription_rel. > > But the case we would like to avoid is where suppose the sequence-sync > > worker does both synchronizing sequence values and updating the > > sequence states for all sequences in one transaction, and if there is > > an error we end up retrying the synchronization for all sequences. > > > > The one idea to avoid this is to update sequences in chunks (say 100 > or some threshold number of sequences in one transaction). Then we > would only redo the sync for the last and pending set of sequences. That could be one idea. > > > > > > > Now, say we don't want to maintain the state of sequences for initial > > > sync at all then after the error how will we detect if there are any > > > pending sequences to be synced? 
One possibility is that we maintain a > > > subscription level flag 'subsequencesync' in 'pg_subscription' to > > > indicate whether sequences need sync. This flag would indicate whether > > > to sync all the sequences in pg_subscription_rel. This would mean that > > > if there is an error while syncing the sequences we will resync all > > > the sequences again. This could be acceptable considering the chances > > > of error during sequence sync are low. The benefit is that both the > > > REFRESH PUBLICATION SEQUENCES and CREATE SUBSCRIPTION can use the same > > > idea and sync all sequences without needing a new relfilenode. Users > > > can always refer to the 'subsequencesync' flag in 'pg_subscription' to see if > > > all the sequences are synced after executing the command. > > > > I think that REFRESH PUBLICATION {SEQUENCES} can be executed even > > while the sequence-sync worker is synchronizing sequences. In this > > case, the worker might not see new sequences added by the concurrent > > REFRESH PUBLICATION {SEQUENCES} command since it's already running. > > The worker could end up marking the subsequencesync as completed while > > not synchronizing these new sequences. > > > > This is possible but we could avoid REFRESH PUBLICATION {SEQUENCES} by > not allowing a change to the subsequencesync state during the time the > sequence-sync worker is syncing the sequences. This could be restrictive > but there don't seem to be cases where a user would like to > immediately refresh sequences after creating the subscription. I'm concerned that users would not be able to add sequences during the time the sequence-worker is syncing the sequences. For example, suppose we have 10000 sequences and execute REFRESH PUBLICATION {SEQUENCES} to synchronize 10000 sequences. 
Now if we add one sequence to the publication and want to synchronize it to the subscriber, we have to wait for the current REFRESH PUBLICATION {SEQUENCES} to complete, and then execute it again, synchronizing 10001 sequences, instead of synchronizing only the new one. > > > > > > > > > > Or yet another idea I came up with is that a tablesync worker will > > > > > > synchronize both the table and sequences owned by the table. That is, > > > > > > after the tablesync worker caught up with the apply worker, the > > > > > > tablesync worker synchronizes sequences associated with the target > > > > > > table as well. One benefit would be that at the time of initial table > > > > > > sync being completed, the table and its sequence data are consistent. > > > > Correction; it's not guaranteed that the sequence data and table data > > > > are consistent even in this case since the tablesync worker could get > > > > on-disk sequence data that might have already been updated. > > > > > > > > > > The benefit of this approach is not clear to me. Our aim is to sync > > > all sequences before the upgrade, so not sure if this helps because > > > anyway both table values and corresponding sequences can again be > > > out-of-sync very quickly. > > > > Right. > > > > Given that our aim is to sync all sequences before the upgrade, do we > > need to synchronize sequences even at CREATE SUBSCRIPTION time? In > > cases where there are a large number of sequences, synchronizing > > sequences in addition to tables could be overhead and make less sense, > > because sequences can again be out-of-sync quickly and typically > > CREATE SUBSCRIPTION is not created just before the upgrade. > > > > I think for the upgrade one should be creating a subscription just > before the upgrade. Isn't something similar done even in the > upgrade steps you shared once [1]? 
I might be missing something but in the blog post they created subscriptions in various ways, waited for the initial table data sync to complete, and then set the sequence values with a buffer based on the old cluster. What I imagined with this sequence synchronization feature is that after the initial table sync completes, we stop executing further transactions on the publisher, synchronize sequences using REFRESH PUBLICATION {SEQUENCES}, and resume the application to execute transactions on the subscriber. So a subscription would be created just before the upgrade, but sequence synchronization would not necessarily happen at the same time as the initial table data synchronization. > Typically users should get all the > data from the publisher before the upgrade of the publisher via > creating a subscription. Also, it would be better to keep the > implementation of sequences close to tables wherever possible. Having > said that, I understand your point as well and if you strongly feel > that we don't need to sync sequences at the time of CREATE > SUBSCRIPTION and others also don't see any problem with it then we can > consider that as well. I see your point that it's better to keep the implementation of sequences close to the table one. So I agree that we can start with this approach, and we will see how it works in practice and consider other options later. > > > > > > > > > > > Do you mean to sync the sequences during the REFRESH PUBLICATION > > > SEQUENCES command itself? If so, there is an argument that we can do > > > the same during CREATE SUBSCRIPTION. It would be beneficial to keep > > > the method to sync the sequences the same for both the CREATE and REFRESH > > > commands. I have speculated on one idea above and would be happy to > > > see your thoughts. > > > > I meant that the REFRESH PUBLICATION SEQUENCES command updates all > > sequence states in pg_subscription_rel to 'init' state, and the > > sequence-sync worker can do the synchronization work. 
We use the same > > method for both the CREATE SUBSCRIPTION and REFRESH PUBLICATION > > {SEQUENCES} commands. > > > > Marking the state as 'init' when we would have already synced the > sequences sounds a bit odd but otherwise, this could also work if we > accept that even if the sequences are synced, the value could remain in > 'init' state (on rollbacks). I mean that it's just for identifying sequences that need to be synced. With the idea of using sequence states in pg_subscription_rel, the REFRESH PUBLICATION SEQUENCES command needs to change states to something so that the sequence-sync worker can identify which sequence needs to be synced. If 'init' sounds odd, we can invent a new state for sequences, say 'needs-to-be-synced'. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
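The per-sequence state handling that this exchange converges on can be modeled compactly. The following is a toy Python sketch with invented helper names, not patch code: entries in pg_subscription_rel start as 'i' (init, i.e. needs-to-be-synced), the sync worker flips them to 'r' (ready), and REFRESH PUBLICATION SEQUENCES resets every sequence back to 'i' so the worker knows to re-copy all of them.

```python
# Toy model of per-sequence states in pg_subscription_rel, as discussed.
# Function names are illustrative assumptions, not PostgreSQL internals.

INIT, READY = 'i', 'r'

def create_subscription(rel_states, published):
    for seq in published:
        rel_states[seq] = INIT              # new entries need an initial sync

def refresh_publication_sequences(rel_states, published):
    for seq in set(rel_states) - set(published):
        del rel_states[seq]                 # drop stale sequences
    for seq in published:
        rel_states[seq] = INIT              # force a re-sync of everything

def sequence_sync_worker_pass(rel_states):
    todo = [s for s, st in rel_states.items() if st == INIT]
    for seq in todo:
        rel_states[seq] = READY             # value copied, mark as synced
    return todo
```

Note how, unlike a subscription-wide boolean, this lets a later REFRESH PUBLICATION (without SEQUENCES) sync only the entries still in 'i'.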
On Tue, Jun 18, 2024 at 7:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Fri, Jun 14, 2024 at 4:04 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Jun 13, 2024 at 6:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > > > > > > > > > > Now, say we don't want to maintain the state of sequences for initial > > > > sync at all then after the error how will we detect if there are any > > > > pending sequences to be synced? One possibility is that we maintain a > > > > subscription level flag 'subsequencesync' in 'pg_subscription' to > > > > indicate whether sequences need sync. This flag would indicate whether > > > > to sync all the sequences in pg_susbcription_rel. This would mean that > > > > if there is an error while syncing the sequences we will resync all > > > > the sequences again. This could be acceptable considering the chances > > > > of error during sequence sync are low. The benefit is that both the > > > > REFRESH PUBLICATION SEQUENCES and CREATE SUBSCRIPTION can use the same > > > > idea and sync all sequences without needing a new relfilenode. Users > > > > can always refer 'subsequencesync' flag in 'pg_subscription' to see if > > > > all the sequences are synced after executing the command. > > > > > > I think that REFRESH PUBLICATION {SEQUENCES} can be executed even > > > while the sequence-sync worker is synchronizing sequences. In this > > > case, the worker might not see new sequences added by the concurrent > > > REFRESH PUBLICATION {SEQUENCES} command since it's already running. > > > The worker could end up marking the subsequencesync as completed while > > > not synchronizing these new sequences. > > > > > > > This is possible but we could avoid REFRESH PUBLICATION {SEQUENCES} by > > not allowing to change the subsequencestate during the time > > sequence-worker is syncing the sequences. 
This could be restrictive > > but there doesn't seem to be cases where user would like to > > immediately refresh sequences after creating the subscription. > > I'm concerned that users would not be able to add sequences during the > time the sequence-worker is syncing the sequences. For example, > suppose we have 10000 sequences and execute REFRESH PUBLICATION > {SEQUENCES} to synchronize 10000 sequences. Now if we add one sequence > to the publication and want to synchronize it to the subscriber, we > have to wait for the current REFRESH PUBLICATION {SEQUENCES} to > complete, and then execute it again, synchronizing 10001 sequences, > instead of synchronizing only the new one. > I see your point and it could hurt such scenarios even though they won't be frequent. So, let's focus on our other approach of maintaining the flag at a per-sequence level in pg_subscription_rel. > > > > > > > > > > > > > Or yet another idea I came up with is that a tablesync worker will > > > > > > > synchronize both the table and sequences owned by the table. That is, > > > > > > > after the tablesync worker caught up with the apply worker, the > > > > > > > tablesync worker synchronizes sequences associated with the target > > > > > > > table as well. One benefit would be that at the time of initial table > > > > > > > sync being completed, the table and its sequence data are consistent. > > > > > > > > > > Correction; it's not guaranteed that the sequence data and table data > > > > > are consistent even in this case since the tablesync worker could get > > > > > on-disk sequence data that might have already been updated. > > > > > > > > > > > > > The benefit of this approach is not clear to me. Our aim is to sync > > > > all sequences before the upgrade, so not sure if this helps because > > > > anyway both table values and corresponding sequences can again be > > > > out-of-sync very quickly. > > > > > > Right. 
> > > > > > Given that our aim is to sync all sequences before the upgrade, do we > > > need to synchronize sequences even at CREATE SUBSCRIPTION time? In > > > cases where there are a large number of sequences, synchronizing > > > sequences in addition to tables could be overhead and make less sense, > > > because sequences can again be out-of-sync quickly and typically > > > CREATE SUBSCRIPTION is not created just before the upgrade. > > > > > > > I think for the upgrade one should be creating a subscription just > > before the upgrade. Isn't something similar done even in the > > upgrade steps you shared once [1]? > > I might be missing something but in the blog post they created > subscriptions in various ways, waited for the initial table data sync > to complete, and then set the sequence values with a buffer based on > the old cluster. What I imagined with this sequence synchronization > feature is that after the initial table sync completes, we stop > executing further transactions on the publisher, synchronize sequences > using REFRESH PUBLICATION {SEQUENCES}, and resume the application to > execute transactions on the subscriber. So a subscription would be > created just before the upgrade, but sequence synchronization would > not necessarily happen at the same time as the initial table data > synchronization. > It depends on the exact steps of the upgrade. For example, if one stops the publisher before adding sequences to a subscription either via create subscription or alter subscription add/set command then there won't be a need for a separate refresh but OTOH, if one follows the steps you mentioned then the refresh would be required. As you are okay with syncing the sequences while creating a subscription in the below part of the email, there is not much point in arguing about this further. > > Typically users should get all the > > data from the publisher before the upgrade of the publisher via > > creating a subscription. 
Also, it would be better to keep the > > implementation of sequences close to tables wherever possible. Having > > said that, I understand your point as well and if you strongly feel > > that we don't need to sync sequences at the time of CREATE > > SUBSCRIPTION and others also don't see any problem with it then we can > > consider that as well. > > I see your point that it's better to keep the implementation of > sequences close to the table one. So I agree that we can start with > this approach, and we will see how it works in practice and consider > other options later. > Makes sense. > > > > Marking the state as 'init' when we would have already synced the > > sequences sounds a bit odd but otherwise, this could also work if we > > accept that even if the sequences are synced, the value could remain in > > 'init' state (on rollbacks). > > I mean that it's just for identifying sequences that need to be > synced. With the idea of using sequence states in pg_subscription_rel, > the REFRESH PUBLICATION SEQUENCES command needs to change states to > something so that the sequence-sync worker can identify which sequence > needs to be synced. If 'init' sounds odd, we can invent a new state > for sequences, say 'needs-to-be-synced'. > Agreed and I am not sure which is better because there is a value in keeping the state name the same for both sequences and tables. We probably need more comments in code and doc updates to make the behavior clear. We can start with the sequence state as 'init' for 'needs-to-be-synced' and 'ready' for 'synced' and can change if others feel so during the review. -- With Regards, Amit Kapila.
On Tue, 18 Jun 2024 at 16:10, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > Agreed and I am not sure which is better because there is a value in > keeping the state name the same for both sequences and tables. We > probably need more comments in code and doc updates to make the > behavior clear. We can start with the sequence state as 'init' for > 'needs-to-be-synced' and 'ready' for 'synced' and can change if others > feel so during the review. Here is a patch which does the sequence synchronization in the following lines from the above discussion: This commit introduces sequence synchronization during 1) creation of subscription for initial sync of sequences 2) refresh publication to synchronize the sequences for the newly created sequences 3) refresh publication sequences for synchronizing all the sequences. 1) During subscription creation with CREATE SUBSCRIPTION (no syntax change): - The subscriber retrieves sequences associated with publications. - Sequences are added in the 'init' state to the pg_subscription_rel table. - Sequence synchronization worker will be started if there are any sequences to be synchronized - A new sequence synchronization worker handles synchronization in batches of 100 sequences: a) Retrieves sequence values using pg_sequence_state from the publisher. b) Sets sequence values accordingly. c) Updates sequence state to 'READY' in pg_subscription_rel d) Commits batches of 100 synchronized sequences. 2) Refreshing sequences with ALTER SUBSCRIPTION ... REFRESH PUBLICATION (no syntax change): - Stale sequences are removed from pg_subscription_rel. - Newly added sequences in the publisher are added in 'init' state to pg_subscription_rel. - Sequence synchronization will be done by sequence sync worker as listed in subscription creation process. - Sequence synchronization occurs for newly added sequences only. 3) Introduce new command ALTER SUBSCRIPTION ... 
REFRESH PUBLICATION SEQUENCES for refreshing all sequences: - Removes stale sequences and adds newly added sequences from the publisher to pg_subscription_rel. - Resets all sequences in pg_subscription_rel to 'init' state. - Initiates sequence synchronization for all sequences by sequence sync worker as listed in subscription creation process. Regards, Vignesh
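Steps a) through d) of the sequence-sync worker described above amount to a batched copy loop. A rough Python rendering follows (the callback names are assumptions for illustration; the real worker is C code fetching values via pg_sequence_state over the replication connection):

```python
# Toy sketch of the batched sequence-sync loop, not the patch code itself.

MAX_SEQUENCES_SYNC_PER_BATCH = 100  # batch size mentioned in the summary

def sync_in_batches(init_sequences, fetch_remote_value, set_local_value,
                    mark_ready, commit):
    """Copy each 'init'-state sequence from the publisher, committing
    after every MAX_SEQUENCES_SYNC_PER_BATCH sequences (and at the end)."""
    for i, seq in enumerate(init_sequences, start=1):
        set_local_value(seq, fetch_remote_value(seq))  # steps a) and b)
        mark_ready(seq)                                # step c): state -> 'r'
        if i % MAX_SEQUENCES_SYNC_PER_BATCH == 0 or i == len(init_sequences):
            commit()                                   # step d): batch commit
```

Committing per batch bounds the transaction size and limits how much work is lost if the worker errors out mid-run: only the uncommitted batch has to be redone.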
Attachments
On Wed, 19 Jun 2024 at 20:33, vignesh C <vignesh21@gmail.com> wrote: > > On Tue, 18 Jun 2024 at 16:10, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > Agreed and I am not sure which is better because there is a value in > > keeping the state name the same for both sequences and tables. We > > probably need more comments in code and doc updates to make the > > behavior clear. We can start with the sequence state as 'init' for > > 'needs-to-be-sycned' and 'ready' for 'synced' and can change if others > > feel so during the review. > > Here is a patch which does the sequence synchronization in the > following lines from the above discussion: > This commit introduces sequence synchronization during 1) creation of > subscription for initial sync of sequences 2) refresh publication to > synchronize the sequences for the newly created sequences 3) refresh > publication sequences for synchronizing all the sequences. > 1) During subscription creation with CREATE SUBSCRIPTION (no syntax change): > - The subscriber retrieves sequences associated with publications. > - Sequences are added in the 'init' state to the pg_subscription_rel table. > - Sequence synchronization worker will be started if there are any > sequences to be synchronized > - A new sequence synchronization worker handles synchronization in > batches of 100 sequences: > a) Retrieves sequence values using pg_sequence_state from the publisher. > b) Sets sequence values accordingly. > c) Updates sequence state to 'READY' in pg_susbcripion_rel > d) Commits batches of 100 synchronized sequences. > 2) Refreshing sequences with ALTER SUBSCRIPTION ... REFRESH > PUBLICATION (no syntax change): > - Stale sequences are removed from pg_subscription_rel. > - Newly added sequences in the publisher are added in 'init' state > to pg_subscription_rel. > - Sequence synchronization will be done by sequence sync worker as > listed in subscription creation process. 
> - Sequence synchronization occurs for newly added sequences only. > 3) Introduce new command ALTER SUBSCRIPTION ... REFRESH PUBLICATION > SEQUENCES for refreshing all sequences: > - Removes stale sequences and adds newly added sequences from the > publisher to pg_subscription_rel. > - Resets all sequences in pg_subscription_rel to 'init' state. > - Initiates sequence synchronization for all sequences by sequence > sync worker as listed in subscription creation process. Here is an updated patch with a few fixes: removed an unused function, changed a few references of table to sequence, and added one CHECK_FOR_INTERRUPTS in the sequence sync worker loop. Regards, Vignesh
Attachments
On Wed, Jun 19, 2024 at 8:33 PM vignesh C <vignesh21@gmail.com> wrote: > > On Tue, 18 Jun 2024 at 16:10, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > Agreed and I am not sure which is better because there is a value in > > keeping the state name the same for both sequences and tables. We > > probably need more comments in code and doc updates to make the > > behavior clear. We can start with the sequence state as 'init' for > > 'needs-to-be-sycned' and 'ready' for 'synced' and can change if others > > feel so during the review. > > Here is a patch which does the sequence synchronization in the > following lines from the above discussion: > Thanks for summarizing the points discussed. I would like to confirm whether the patch replicates new sequences that are created implicitly/explicitly for a publication defined as ALL SEQUENCES. -- With Regards, Amit Kapila.
On Thu, 20 Jun 2024 at 18:45, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Jun 19, 2024 at 8:33 PM vignesh C <vignesh21@gmail.com> wrote: > > > > On Tue, 18 Jun 2024 at 16:10, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > Agreed and I am not sure which is better because there is a value in > > > keeping the state name the same for both sequences and tables. We > > > probably need more comments in code and doc updates to make the > > > behavior clear. We can start with the sequence state as 'init' for > > > 'needs-to-be-synced' and 'ready' for 'synced' and can change if others > > > feel so during the review. > > > > Here is a patch which does the sequence synchronization in the > > following lines from the above discussion: > > > > Thanks for summarizing the points discussed. I would like to confirm > whether the patch replicates new sequences that are created > implicitly/explicitly for a publication defined as ALL SEQUENCES. Currently, for a FOR ALL SEQUENCES publication, both explicitly created sequences and implicitly created sequences will be synchronized during the creation of subscriptions (using CREATE SUBSCRIPTION) and the refreshing of publication sequences (using ALTER SUBSCRIPTION ... REFRESH PUBLICATION SEQUENCES). Therefore, the explicitly created sequence seq1: CREATE SEQUENCE seq1; and the implicitly created sequence seq_test2_c2_seq for the seq_test2 table: CREATE TABLE seq_test2 (c1 int, c2 SERIAL); will both be synchronized. Regards, Vignesh
On Thu, 20 Jun 2024 at 18:24, vignesh C <vignesh21@gmail.com> wrote: > > On Wed, 19 Jun 2024 at 20:33, vignesh C <vignesh21@gmail.com> wrote: > > > > On Tue, 18 Jun 2024 at 16:10, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > Agreed and I am not sure which is better because there is a value in > > > keeping the state name the same for both sequences and tables. We > > > probably need more comments in code and doc updates to make the > > > behavior clear. We can start with the sequence state as 'init' for > > > 'needs-to-be-sycned' and 'ready' for 'synced' and can change if others > > > feel so during the review. > > > > Here is a patch which does the sequence synchronization in the > > following lines from the above discussion: > > This commit introduces sequence synchronization during 1) creation of > > subscription for initial sync of sequences 2) refresh publication to > > synchronize the sequences for the newly created sequences 3) refresh > > publication sequences for synchronizing all the sequences. > > 1) During subscription creation with CREATE SUBSCRIPTION (no syntax change): > > - The subscriber retrieves sequences associated with publications. > > - Sequences are added in the 'init' state to the pg_subscription_rel table. > > - Sequence synchronization worker will be started if there are any > > sequences to be synchronized > > - A new sequence synchronization worker handles synchronization in > > batches of 100 sequences: > > a) Retrieves sequence values using pg_sequence_state from the publisher. > > b) Sets sequence values accordingly. > > c) Updates sequence state to 'READY' in pg_susbcripion_rel > > d) Commits batches of 100 synchronized sequences. > > 2) Refreshing sequences with ALTER SUBSCRIPTION ... REFRESH > > PUBLICATION (no syntax change): > > - Stale sequences are removed from pg_subscription_rel. > > - Newly added sequences in the publisher are added in 'init' state > > to pg_subscription_rel. 
> > - Sequence synchronization will be done by sequence sync worker as > > listed in subscription creation process. > > - Sequence synchronization occurs for newly added sequences only. > > 3) Introduce new command ALTER SUBSCRIPTION ... REFRESH PUBLICATION > > SEQUENCES for refreshing all sequences: > > - Removes stale sequences and adds newly added sequences from the > > publisher to pg_subscription_rel. > > - Resets all sequences in pg_subscription_rel to 'init' state. > > - Initiates sequence synchronization for all sequences by sequence > > sync worker as listed in subscription creation process. > > Here is an updated patch with a few fixes to remove an unused > function, changed a few references of table to sequence and added one > CHECK_FOR_INTERRUPTS in the sequence sync worker loop. Hi Vignesh, I have reviewed the patches and I have the following comments: ===== tablesync.c ====== 1. process_syncing_sequences_for_apply can crash with: 2024-06-21 15:25:17.208 IST [3681269] LOG: logical replication apply worker for subscription "test1" has started 2024-06-21 15:28:10.127 IST [3682329] LOG: logical replication sequences synchronization worker for subscription "test1" has started 2024-06-21 15:28:10.146 IST [3682329] LOG: logical replication synchronization for subscription "test1", sequence "s1" has finished 2024-06-21 15:28:10.149 IST [3682329] LOG: logical replication synchronization for subscription "test1", sequence "s2" has finished 2024-06-21 15:28:10.149 IST [3682329] LOG: logical replication sequences synchronization worker for subscription "test1" has finished 2024-06-21 15:29:53.535 IST [3682767] LOG: logical replication sequences synchronization worker for subscription "test1" has started TRAP: failed Assert("nestLevel > 0 && (nestLevel <= GUCNestLevel || (nestLevel == GUCNestLevel + 1 && !isCommit))"), File: "guc.c", Line: 2273, PID: 3682767 postgres: logical replication sequencesync worker for subscription 16389 sync 0 
(ExceptionalCondition+0xbb)[0x5b2a61861c99] postgres: logical replication sequencesync worker for subscription 16389 sync 0 (AtEOXact_GUC+0x7b)[0x5b2a618bddfa] postgres: logical replication sequencesync worker for subscription 16389 sync 0 (RestoreUserContext+0xc7)[0x5b2a618a6937] postgres: logical replication sequencesync worker for subscription 16389 sync 0 (+0x1ff7dfa)[0x5b2a61115dfa] postgres: logical replication sequencesync worker for subscription 16389 sync 0 (+0x1ff7eb4)[0x5b2a61115eb4] postgres: logical replication sequencesync worker for subscription 16389 sync 0 (SequencesyncWorkerMain+0x33)[0x5b2a61115fe7] postgres: logical replication sequencesync worker for subscription 16389 sync 0 (BackgroundWorkerMain+0x4ad)[0x5b2a61029cae] postgres: logical replication sequencesync worker for subscription 16389 sync 0 (postmaster_child_launch+0x236)[0x5b2a6102fb36] postgres: logical replication sequencesync worker for subscription 16389 sync 0 (+0x1f1d12a)[0x5b2a6103b12a] postgres: logical replication sequencesync worker for subscription 16389 sync 0 (+0x1f1df0f)[0x5b2a6103bf0f] postgres: logical replication sequencesync worker for subscription 16389 sync 0 (+0x1f1bf71)[0x5b2a61039f71] postgres: logical replication sequencesync worker for subscription 16389 sync 0 (+0x1f16f73)[0x5b2a61034f73] postgres: logical replication sequencesync worker for subscription 16389 sync 0 (PostmasterMain+0x18fb)[0x5b2a61034445] postgres: logical replication sequencesync worker for subscription 16389 sync 0 (+0x1ab1ab8)[0x5b2a60bcfab8] /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7b76bc629d90] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7b76bc629e40] postgres: logical replication sequencesync worker for subscription 16389 sync 0 (_start+0x25)[0x5b2a601491a5] Analysis: Suppose there are two sequences (s1, s2) on publisher. SO, during initial sync. in loop, + foreach(lc, table_states_not_ready) table_states_not_ready -> it contains both s1 and s2. 
So, for s1 a sequence sync will be started. It will sync all sequences and the sequence sync worker will exit. Now, for s2 again a sequence sync will start. It will give the above error. Is this loop required? Instead we can just use a bool like 'is_any_sequence_not_ready'. Thoughts? ===== sequencesync.c ===== 2. function name should be 'LogicalRepSyncSequences' instead of 'LogicalRepSyncSeqeunces' 3. In function 'LogicalRepSyncSeqeunces' sequencerel = table_open(seqinfo->relid, RowExclusiveLock);\ There is a extra '\' symbol 4. In function LogicalRepSyncSeqeunces: + ereport(LOG, + errmsg("logical replication synchronization for subscription \"%s\", sequence \"%s\" has finished", + get_subscription_name(subid, false), RelationGetRelationName(sequencerel))); + table_close(sequencerel, NoLock); + + currseq++; + + if (currseq % MAX_SEQUENCES_SYNC_PER_BATCH == 0 || currseq == list_length(sequences)) + CommitTransactionCommand(); The above message gets logged even if the changes are not committed. Suppose the sequence worker exits before commit due to some reason. Thought the log will show that sequence is synced, the sequence will be in 'init' state. I think this is not desirable. Maybe we should log the synced sequences at commit time? Thoughts? ===== General ==== 5. We can use other macros like 'foreach_ptr' instead of 'foreach' Thanks and Regards, Shlok Kyal
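For what it's worth, the commit-time logging suggested in point 4 could look roughly like this self-contained sketch. The PostgreSQL APIs are replaced by stand-ins (the real code would call CommitTransactionCommand() and ereport() instead of building a string), and all names besides MAX_SEQUENCES_SYNC_PER_BATCH and currseq are illustrative, not the patch's actual code:

```c
#include <string.h>

#define MAX_SEQUENCES_SYNC_PER_BATCH 100

/*
 * Hypothetical sketch: buffer the names of sequences synced in the
 * current batch and emit the "has finished" message only after the
 * (simulated) commit, so the log never claims success for work that
 * was not committed.
 */
static int
sync_sequences(const char *seqnames[], int nseqs, char *logbuf, size_t logsz)
{
    const char *pending[MAX_SEQUENCES_SYNC_PER_BATCH];
    int         npending = 0;
    int         nlogged = 0;

    logbuf[0] = '\0';
    for (int currseq = 1; currseq <= nseqs; currseq++)
    {
        /* ... set the sequence value here (elided) ... */
        pending[npending++] = seqnames[currseq - 1];

        if (currseq % MAX_SEQUENCES_SYNC_PER_BATCH == 0 || currseq == nseqs)
        {
            /* CommitTransactionCommand() would go here; log afterwards. */
            for (int i = 0; i < npending; i++)
            {
                strncat(logbuf, "synced ", logsz - strlen(logbuf) - 1);
                strncat(logbuf, pending[i], logsz - strlen(logbuf) - 1);
                strncat(logbuf, "\n", logsz - strlen(logbuf) - 1);
                nlogged++;
            }
            npending = 0;
        }
    }
    return nlogged;
}
```

If the worker dies before the flush, nothing about the unfinished batch has been logged, which matches the state on disk.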
On Tue, 25 Jun 2024 at 17:53, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote: > > On Thu, 20 Jun 2024 at 18:24, vignesh C <vignesh21@gmail.com> wrote: > > > > On Wed, 19 Jun 2024 at 20:33, vignesh C <vignesh21@gmail.com> wrote: > > > > > > On Tue, 18 Jun 2024 at 16:10, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > Agreed and I am not sure which is better because there is a value in > > > > keeping the state name the same for both sequences and tables. We > > > > probably need more comments in code and doc updates to make the > > > > behavior clear. We can start with the sequence state as 'init' for > > > > 'needs-to-be-sycned' and 'ready' for 'synced' and can change if others > > > > feel so during the review. > > > > > > Here is a patch which does the sequence synchronization in the > > > following lines from the above discussion: > > > This commit introduces sequence synchronization during 1) creation of > > > subscription for initial sync of sequences 2) refresh publication to > > > synchronize the sequences for the newly created sequences 3) refresh > > > publication sequences for synchronizing all the sequences. > > > 1) During subscription creation with CREATE SUBSCRIPTION (no syntax change): > > > - The subscriber retrieves sequences associated with publications. > > > - Sequences are added in the 'init' state to the pg_subscription_rel table. > > > - Sequence synchronization worker will be started if there are any > > > sequences to be synchronized > > > - A new sequence synchronization worker handles synchronization in > > > batches of 100 sequences: > > > a) Retrieves sequence values using pg_sequence_state from the publisher. > > > b) Sets sequence values accordingly. > > > c) Updates sequence state to 'READY' in pg_susbcripion_rel > > > d) Commits batches of 100 synchronized sequences. > > > 2) Refreshing sequences with ALTER SUBSCRIPTION ... 
REFRESH > > > PUBLICATION (no syntax change): > > > - Stale sequences are removed from pg_subscription_rel. > > > - Newly added sequences in the publisher are added in 'init' state > > > to pg_subscription_rel. > > > - Sequence synchronization will be done by sequence sync worker as > > > listed in subscription creation process. > > > - Sequence synchronization occurs for newly added sequences only. > > > 3) Introduce new command ALTER SUBSCRIPTION ... REFRESH PUBLICATION > > > SEQUENCES for refreshing all sequences: > > > - Removes stale sequences and adds newly added sequences from the > > > publisher to pg_subscription_rel. > > > - Resets all sequences in pg_subscription_rel to 'init' state. > > > - Initiates sequence synchronization for all sequences by sequence > > > sync worker as listed in subscription creation process. > > > > Here is an updated patch with a few fixes to remove an unused > > function, changed a few references of table to sequence and added one > > CHECK_FOR_INTERRUPTS in the sequence sync worker loop. > > Hi Vignesh, > > I have reviewed the patches and I have following comments: > > ===== tablesync.c ====== > 1. 
process_syncing_sequences_for_apply can crash with: > 2024-06-21 15:25:17.208 IST [3681269] LOG: logical replication apply > worker for subscription "test1" has started > 2024-06-21 15:28:10.127 IST [3682329] LOG: logical replication > sequences synchronization worker for subscription "test1" has started > 2024-06-21 15:28:10.146 IST [3682329] LOG: logical replication > synchronization for subscription "test1", sequence "s1" has finished > 2024-06-21 15:28:10.149 IST [3682329] LOG: logical replication > synchronization for subscription "test1", sequence "s2" has finished > 2024-06-21 15:28:10.149 IST [3682329] LOG: logical replication > sequences synchronization worker for subscription "test1" has finished > 2024-06-21 15:29:53.535 IST [3682767] LOG: logical replication > sequences synchronization worker for subscription "test1" has started > TRAP: failed Assert("nestLevel > 0 && (nestLevel <= GUCNestLevel || > (nestLevel == GUCNestLevel + 1 && !isCommit))"), File: "guc.c", Line: > 2273, PID: 3682767 > postgres: logical replication sequencesync worker for subscription > 16389 sync 0 (ExceptionalCondition+0xbb)[0x5b2a61861c99] > postgres: logical replication sequencesync worker for subscription > 16389 sync 0 (AtEOXact_GUC+0x7b)[0x5b2a618bddfa] > postgres: logical replication sequencesync worker for subscription > 16389 sync 0 (RestoreUserContext+0xc7)[0x5b2a618a6937] > postgres: logical replication sequencesync worker for subscription > 16389 sync 0 (+0x1ff7dfa)[0x5b2a61115dfa] > postgres: logical replication sequencesync worker for subscription > 16389 sync 0 (+0x1ff7eb4)[0x5b2a61115eb4] > postgres: logical replication sequencesync worker for subscription > 16389 sync 0 (SequencesyncWorkerMain+0x33)[0x5b2a61115fe7] > postgres: logical replication sequencesync worker for subscription > 16389 sync 0 (BackgroundWorkerMain+0x4ad)[0x5b2a61029cae] > postgres: logical replication sequencesync worker for subscription > 16389 sync 0 
(postmaster_child_launch+0x236)[0x5b2a6102fb36] > postgres: logical replication sequencesync worker for subscription > 16389 sync 0 (+0x1f1d12a)[0x5b2a6103b12a] > postgres: logical replication sequencesync worker for subscription > 16389 sync 0 (+0x1f1df0f)[0x5b2a6103bf0f] > postgres: logical replication sequencesync worker for subscription > 16389 sync 0 (+0x1f1bf71)[0x5b2a61039f71] > postgres: logical replication sequencesync worker for subscription > 16389 sync 0 (+0x1f16f73)[0x5b2a61034f73] > postgres: logical replication sequencesync worker for subscription > 16389 sync 0 (PostmasterMain+0x18fb)[0x5b2a61034445] > postgres: logical replication sequencesync worker for subscription > 16389 sync 0 (+0x1ab1ab8)[0x5b2a60bcfab8] > /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7b76bc629d90] > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7b76bc629e40] > postgres: logical replication sequencesync worker for subscription > 16389 sync 0 (_start+0x25)[0x5b2a601491a5] > > Analysis: > Suppose there are two sequences (s1, s2) on publisher. > SO, during initial sync. > in loop, > + foreach(lc, table_states_not_ready) > > table_states_not_ready -> it contains both s1 and s2. > So, for s1 a sequence sync will be started. It will sync all sequences > and the sequence sync worker will exit. > Now, for s2 again a sequence sync will start. It will give the above error. > > Is this loop required? Instead we can just use a bool like > 'is_any_sequence_not_ready'. Thoughts? > > ===== sequencesync.c ===== > 2. function name should be 'LogicalRepSyncSequences' instead of > 'LogicalRepSyncSeqeunces' > > 3. In function 'LogicalRepSyncSeqeunces' > sequencerel = table_open(seqinfo->relid, RowExclusiveLock);\ > There is a extra '\' symbol > > 4. 
In function LogicalRepSyncSeqeunces: > + ereport(LOG, > + errmsg("logical replication synchronization for subscription \"%s\", > sequence \"%s\" has finished", > + get_subscription_name(subid, false), RelationGetRelationName(sequencerel))); > + table_close(sequencerel, NoLock); > + > + currseq++; > + > + if (currseq % MAX_SEQUENCES_SYNC_PER_BATCH == 0 || currseq == > list_length(sequences)) > + CommitTransactionCommand(); > > > The above message gets logged even if the changes are not committed. > Suppose the sequence worker exits before commit due to some reason. > Thought the log will show that sequence is synced, the sequence will > be in 'init' state. I think this is not desirable. > Maybe we should log the synced sequences at commit time? Thoughts? > > ===== General ==== > 5. We can use other macros like 'foreach_ptr' instead of 'foreach' Thanks for the comments, the attached patch has the fixes for the same. Regards, Vignesh
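Point 1 above (replace the per-relation loop with a single "is anything not ready" check) could be sketched as below. The struct, the launcher, and the flag name are stand-ins of my own, not the patch's real code; the point is only that one unsynced entry is enough to decide, so at most one worker is launched:

```c
#include <stdbool.h>

/* Stand-in for the real per-relation sync state. */
struct rel_state
{
    bool ready;
};

static int nworkers_launched = 0;

/* Stand-in for launching the sequencesync worker. */
static void
launch_sequencesync_worker(void)
{
    nworkers_launched++;
}

/*
 * Hypothetical sketch: compute a single flag instead of starting one
 * worker per not-ready entry.  The first worker syncs all sequences,
 * so launching again for the second entry is what triggered the crash.
 */
static int
process_syncing_sequences(const struct rel_state *states, int nstates)
{
    bool any_sequence_not_ready = false;

    for (int i = 0; i < nstates; i++)
    {
        if (!states[i].ready)
        {
            any_sequence_not_ready = true;
            break;              /* one unsynced sequence is enough */
        }
    }

    if (any_sequence_not_ready)
        launch_sequencesync_worker();   /* single worker syncs them all */

    return nworkers_launched;
}
```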
Here are my initial review comments for the first patch v20240625-0001.

======
General

1. Missing docs?

Section 9.17. "Sequence Manipulation Functions" [1] describes some
functions. Shouldn't your new function be documented here also?

~~~

2. Missing tests?

Shouldn't there be some test code that at least executes your new
pg_sequence_state function to verify that sane values are returned?

======
Commit Message

3.
This patch introduces new functionalities to PostgreSQL:
- pg_sequence_state allows retrieval of sequence values using LSN.
- SetSequence enables updating sequences with user-specified values.

~

3a. I didn't understand why this says "using LSN" because IIUC 'lsn' is
an output parameter of that function. Don't you mean "... retrieval of
sequence values including LSN"?

~

3b. Does "user-specified" make sense? Is this going to be exposed to a
user? How about just "specified"?

======
src/backend/commands/sequence.c

4. SetSequence:

+void
+SetSequence(Oid seq_relid, int64 value)

Would 'new_last_value' be a better parameter name here?

~~~

5. This new function logic looks pretty similar to the do_setval()
function. Can you explain (maybe in the function comment) how and why
it differs from that other function?

~~~

6. I saw that RelationNeedsWAL() is called 2 times. It may make no
sense, but is it possible to assign that to a variable the first time
so you don't need to call it a second time within the critical section?

~~~

NITPICK - remove junk (') char in comment
NITPICK - missing periods (.) in multi-sentence comment

~~~

7.
-read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple)
+read_seq_tuple(Relation rel, Buffer *buf, HeapTuple seqdatatuple,
+ XLogRecPtr *lsn)

7a. The existing parameters were described in the function comment. So,
the new 'lsn' parameter should be described here also.

~

7b. Maybe the new parameter name should be 'lsn_res' or 'lsn_out' or
similar to emphasise that this is a returned value.

~~

NITPICK - tweaked comment. YMMV.

~~~

8. pg_sequence_state:

Should you give descriptions of the output parameters in the function
header comment? Otherwise, where are they described so callers know
what they mean?

~~~

NITPICK - /relid/seq_relid/
NITPICK - declare the variables in the same order as the output parameters
NITPICK - An alternative to the memset for nulls is just to use static
initialisation "bool nulls[4] = {false, false, false, false};"

======

+extern void SetSequence(Oid seq_relid, int64 value);

9. Would 'SetSequenceLastValue' be a better name for what this function
is doing?

======

99. See also my attached diff which is a top-up patch implementing
those nitpicks mentioned above. Please apply any of these that you
agree with.

======
[1] https://www.postgresql.org/docs/devel/functions-sequence.html

Kind Regards,
Peter Smith.
Fujitsu Australia