Discussion: [PATCH] Support automatic sequence replication
Hello hackers,

I'd like to propose an improvement to the sequence replication feature
that was committed in [1].

The current implementation synchronizes sequences during initial
subscription setup, but the sequence sync worker exits after this
initial sync. This means that as sequences advance on the publisher,
they drift from the subscriber values over time. Users must manually
run ALTER SUBSCRIPTION ... REFRESH SEQUENCES to resynchronize, which
requires monitoring and intervention.

Proposed Enhancement:

This patch changes the sequence sync worker to run continuously
throughout the subscription lifetime, automatically detecting and
correcting sequence drift. The key changes are:

1. The sequence sync worker remains running instead of exiting after
initial sync, periodically checking for and synchronizing drifted
sequences.

2. The worker uses an exponential backoff strategy - starting at 2
seconds, doubling up to a maximum of 30 seconds when sequences are in
sync, and resetting to the minimum interval when drift is detected.

3. Since synchronization is now automatic, ALTER SUBSCRIPTION ...
REFRESH SEQUENCES is no longer needed and has been removed.

The patch modifies documentation to reflect the new behavior, removes
the REFRESH SEQUENCES command from the grammar and subscription
commands, and implements the continuous monitoring loop in
sequencesync.c. TAP tests have been updated to verify automatic
synchronization rather than manual refresh.

The v2 patch is attached and ready for review.

Thoughts and feedback are welcome!

[1] - https://github.com/postgres/postgres/commit/5509055d6956745532e65ab218e15b99d87d66ce

Best regards,
Ajin Cherian
Fujitsu Australia
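For readers following along, the backoff rule in item 2 can be sketched as below. This is only an illustration of the rule as described in the mail, not code from the patch: the function name and the two macros are hypothetical, and only the 2 second and 30 second bounds come from the proposal.

```c
#include <assert.h>
#include <stdbool.h>

/* Bounds from the proposal; the macro names are hypothetical. */
#define SEQSYNC_MIN_NAPTIME_MS 2000L
#define SEQSYNC_MAX_NAPTIME_MS 30000L

/*
 * Return the next nap time in milliseconds.
 *
 * If drift was detected in the last cycle, fall back to the minimum
 * interval.  Otherwise double the current interval, capped at the maximum.
 */
long
next_seqsync_naptime(long current_ms, bool drift_detected)
{
    long next_ms;

    assert(current_ms > 0);

    if (drift_detected)
        return SEQSYNC_MIN_NAPTIME_MS;

    next_ms = current_ms * 2;
    if (next_ms > SEQSYNC_MAX_NAPTIME_MS)
        next_ms = SEQSYNC_MAX_NAPTIME_MS;

    return next_ms;
}
```

With these assumptions, an idle worker would nap for 2, 4, 8, 16, 30, 30, ... seconds, and go back to 2 seconds as soon as one cycle finds a drifted sequence.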
On Tue, Feb 3, 2026 at 9:18 AM Ajin Cherian <itsajin@gmail.com> wrote:
>
> Hello hackers,
>
> I'd like to propose an improvement to the sequence replication feature
> that was committed in [1].
>
> The current implementation synchronizes sequences during initial
> subscription setup, but the sequence sync worker exits after this
> initial sync. This means that as sequences advance on the publisher,
> they drift from the subscriber values over time. Users must manually
> run ALTER SUBSCRIPTION ... REFRESH SEQUENCES to resynchronize, which
> requires monitoring and intervention.
>
> Proposed Enhancement:
>
> This patch changes the sequence sync worker to run continuously
> throughout the subscription lifetime, automatically detecting and
> correcting sequence drift. The key changes are:
>
> 1. The sequence sync worker remains running instead of exiting after
> initial sync, periodically checking for and synchronizing drifted
> sequences.
>
> 2. The worker uses an exponential backoff strategy - starting at 2
> seconds, doubling up to a maximum of 30 seconds when sequences are in
> sync, and resetting to the minimum interval when drift is detected.
>
> 3. Since synchronization is now automatic, ALTER SUBSCRIPTION ...
> REFRESH SEQUENCES is no longer needed and has been removed.
>
> The patch modifies documentation to reflect the new behavior, removes
> the REFRESH SEQUENCES command from the grammar and subscription
> commands, and implements the continuous monitoring loop in
> sequencesync.c. TAP tests have been updated to verify automatic
> synchronization rather than manual refresh.
>
> The v2 patch is attached and ready for review.
>
> Thoughts and feedback are welcome!
>
> [1] - https://github.com/postgres/postgres/commit/5509055d6956745532e65ab218e15b99d87d66ce
>

Thanks for the patch.

+1 for the overall idea of the patch: once a subscription that
subscribes to sequences is created, a sequence sync worker is started
that continuously syncs the sequences. This makes REFRESH SEQUENCES
redundant, so it is removed. I am still reviewing the design choice
here, and will post my comments soon (if any).

From a quick validation, I see a few issues in the current
implementation:

1)
If the sequence sync worker exits due to some issue (or is killed, or
the server restarts), it is not started again by the apply worker
unless there is a sequence in INIT state, i.e. synchronization of
sequences in READY state stops. IIUC, the logic of
ProcessSequencesForSync() needs to change to start the sequence sync
worker irrespective of the state of the sequences.

2)
There is an issue in how LOG (DEBUG) messages are generated.

a) Even if there is no drift, it still keeps dumping:
"logical replication sequence synchronization for subscription "sub1"
- total unsynchronized: 3"

b) When there is drift in, say, a single sequence, it puts the rest
(which are in sync) in the "missing" section:
"logical replication sequence synchronization for subscription "sub1"
- batch #1 = 3 attempted, 1 succeeded, 0 mismatched, 0 insufficient
permission, 2 missing from publisher, 0 skipped"

3)
If the sequence sync worker is taking a nap, and the subscription is
disabled or the server is stopped just before an upgrade, how is the
user supposed to know that the sequences are synced at the end?

thanks
Shveta
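To make issue 1 concrete, the difference between the reported restart condition and the suggested one can be sketched as two predicates. This is a simplified model and not the actual ProcessSequencesForSync() logic: the enum, both function names, and the idea of passing the states as an array are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the per-sequence states discussed above. */
typedef enum SeqState
{
    SEQ_STATE_INIT,
    SEQ_STATE_READY
} SeqState;

/* Reported behavior: (re)start the worker only if some sequence is in INIT. */
bool
should_start_worker_init_only(const SeqState *states, size_t nseqs)
{
    assert(states != NULL || nseqs == 0);

    for (size_t i = 0; i < nseqs; i++)
    {
        if (states[i] == SEQ_STATE_INIT)
            return true;
    }
    return false;
}

/* Suggested behavior: (re)start whenever the subscription has any sequence. */
bool
should_start_worker_any_state(const SeqState *states, size_t nseqs)
{
    assert(states != NULL || nseqs == 0);

    return nseqs > 0;
}
```

Under the first predicate, a subscription whose sequences are all READY never gets its worker back after a crash or restart, which is the gap described in issue 1.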
On Tue, Feb 3, 2026 at 9:22 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Feb 3, 2026 at 9:18 AM Ajin Cherian <itsajin@gmail.com> wrote:
> Thanks for the patch.
>
> +1 for the overall idea of the patch: once a subscription that
> subscribes to sequences is created, a sequence sync worker is started
> that continuously syncs the sequences. This makes REFRESH SEQUENCES
> redundant, so it is removed. I am still reviewing the design choice
> here, and will post my comments soon (if any).
>

Thanks!

> From a quick validation, I see a few issues in the current
> implementation:
>
> 1)
> If the sequence sync worker exits due to some issue (or is killed, or
> the server restarts), it is not started again by the apply worker
> unless there is a sequence in INIT state, i.e. synchronization of
> sequences in READY state stops. IIUC, the logic of
> ProcessSequencesForSync() needs to change to start the sequence sync
> worker irrespective of the state of the sequences.
>

Yes, I fixed this. I've changed FetchRelationStates to fetch sequences
in any state, not just the ones that are not in READY state.

> 2)
> There is an issue in how LOG (DEBUG) messages are generated.
>
> a) Even if there is no drift, it still keeps dumping:
> "logical replication sequence synchronization for subscription "sub1"
> - total unsynchronized: 3"
>

Removed this.

> b) When there is drift in, say, a single sequence, it puts the rest
> (which are in sync) in the "missing" section:
> "logical replication sequence synchronization for subscription "sub1"
> - batch #1 = 3 attempted, 1 succeeded, 0 mismatched, 0 insufficient
> permission, 2 missing from publisher, 0 skipped"
>

Fixed, and added a new section called "no drift". This debug message
is now also printed every time the worker attempts to synchronize
sequences, and it mentions the state of the sequences being synced.
> 3)
> If the sequence sync worker is taking a nap, and the subscription is
> disabled or the server is stopped just before an upgrade, how is the
> user supposed to know that the sequences are synced at the end?

Well, one way is to wait for a debug message saying that all the
attempted sequences are in the "no drift" state. Also, the remote
sequence's LSN is updated in pg_subscription_rel for each sequence.
Let me know if you have anything more in mind. One option is to leave
ALTER SUBSCRIPTION ... REFRESH SEQUENCES in place; that will change
the state of all the sequences to INIT, and the user can then wait for
the sequences to change state to READY.

Attaching patch v3 addressing the above comments.

regards,
Ajin Cherian
Fujitsu Australia
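As an illustration of the "no drift" bucket from the reply to 2b, per-batch counting could look roughly like this. It is a simplified sketch and not the patch's code: the struct and function names are hypothetical, drift is reduced to comparing a single last value, and the other buckets from the debug message (mismatched, insufficient permission, skipped) are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one sequence as seen on both nodes. */
typedef struct SeqPair
{
    bool    found_on_publisher;
    int64_t publisher_last_value;
    int64_t subscriber_last_value;
} SeqPair;

/* Hypothetical subset of the per-batch counters. */
typedef struct SeqBatchStats
{
    int attempted;
    int synced;     /* drift found and corrected */
    int no_drift;   /* already in sync, nothing to do */
    int missing;    /* not found on the publisher */
} SeqBatchStats;

SeqBatchStats
tally_batch(const SeqPair *seqs, size_t nseqs)
{
    SeqBatchStats stats = {0, 0, 0, 0};

    assert(seqs != NULL || nseqs == 0);

    for (size_t i = 0; i < nseqs; i++)
    {
        stats.attempted++;

        if (!seqs[i].found_on_publisher)
            stats.missing++;
        else if (seqs[i].publisher_last_value == seqs[i].subscriber_last_value)
            stats.no_drift++;   /* in sync is not the same as missing */
        else
            stats.synced++;
    }
    return stats;
}
```

For the case reported in 2b (three sequences, one drifted), this counting gives 1 synced and 2 in "no drift" instead of 2 "missing from publisher".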
We revisited the design of this patch. Sharing my thoughts and
analysis here. Any feedback is appreciated.

Background:
-----------------------
Previously, sequence synchronization was triggered during CREATE
SUBSCRIPTION, ALTER SUBSCRIPTION REFRESH PUBLICATION, and REFRESH
SEQUENCES. A sequence-sync worker was started whenever a sequence
entered the INIT state, which could also occur if a previous sync
failed. Therefore, a mechanism was required to continuously scan
pg_subscription_rel and start a sequence-sync worker for a
subscription whenever any sequence was found in INIT. Since the apply
worker already performs this role for table-sync workers, the same
infrastructure was reused for sequence-sync workers. Using the
launcher for this purpose was rejected, as it would have required
overloading the launcher with logic to repeatedly inspect
pg_subscription_rel and decide whether to start a worker for each
sequence (see discussion at [1]).

Current scenario:
-----------------------
The requirement is different now: the sequence-sync worker is expected
to run continuously, independent of sequence state. This makes us
revisit our design choices and re-analyze whether we can do it in the
launcher.

The primary benefit of starting the sequence-sync worker from the
launcher would be avoiding an extra apply worker for sequence-only
subscriptions. However, this approach introduces challenges. The
launcher currently accesses only the global catalog pg_subscription
and does not establish a database connection (see [2]). To decide
whether to start an apply worker, a sequence-sync worker, or both, the
launcher would need to access pg_subscription_rel, which requires a
database connection. It is unclear which database the launcher should
connect to, since subscriptions can target different databases.
Another option would be to explicitly feed this information to the
launcher during CREATE SUBSCRIPTION and REFRESH PUBLICATION by having
an additional column in pg_subscription indicating object_type:
table_only, seq_only, both. This would undoubtedly add complexity.
Also, I am unsure if it is a good idea to add an additional field to
the global catalog pg_subscription for this purpose.

That said, it is reasonable to expect that users who create a
publication for ALL SEQUENCES will typically have only a single
publication-subscription pair. In such cases, the overhead of an extra
apply worker per subscription, along with a sequence-sync worker, is
likely acceptable.

~~

Considering the above, starting the sequence-sync worker from the
launcher seems feasible (though it would require a more detailed
analysis), but it comes with its own complexities. OTOH, potentially
significant extra worker overhead, which could impact the system,
would only occur if a large number of 'sequence-only' subscriptions
were created. It is unclear whether there is ever a need for multiple
ALL SEQUENCES subscriptions, or whether business requirements would
call for subscribing to ALL SEQUENCES publications on multiple
machines, which would necessitate multiple such subscriptions.

Given this, it seems reasonable to continue with the current design of
starting the sequence-sync worker from the apply worker. We may think
of other approaches if there is any objection or user feedback on this
approach.

~~

[1]: https://www.postgresql.org/message-id/CAA4eK1%2Bp%3DM%2B5NAq5VSxD4_XyE1MBTKwU40RD1cL9PgpbELKBRQ%40m…

[2]:
    /*
     * Establish connection to nailed catalogs (we only ever access
     * pg_subscription).
     */
    BackgroundWorkerInitializeConnection(NULL, NULL, 0);

thanks
Shveta
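For illustration only, the launcher-side decision that the object_type alternative would enable might look like the sketch below. No such column exists, and this is the option the mail argues against pursuing for now; the enum, struct, and function names are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical values for the object_type column floated above. */
typedef enum SubObjectType
{
    SUB_OBJ_TABLE_ONLY,
    SUB_OBJ_SEQ_ONLY,
    SUB_OBJ_BOTH
} SubObjectType;

/* Which workers the launcher would start for one subscription. */
typedef struct WorkersToStart
{
    bool apply_worker;
    bool seqsync_worker;
} WorkersToStart;

WorkersToStart
launcher_pick_workers(SubObjectType type)
{
    WorkersToStart w = {false, false};

    switch (type)
    {
        case SUB_OBJ_TABLE_ONLY:
            w.apply_worker = true;
            break;
        case SUB_OBJ_SEQ_ONLY:
            /* The saving under discussion: no apply worker is needed. */
            w.seqsync_worker = true;
            break;
        case SUB_OBJ_BOTH:
            w.apply_worker = true;
            w.seqsync_worker = true;
            break;
        default:
            assert(false);
            break;
    }
    return w;
}
```

The only case where this differs from the current design is the sequence-only subscription, which matches the observation that the extra apply worker matters only when many such subscriptions exist.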