Thread: [PATCH] Support automatic sequence replication


[PATCH] Support automatic sequence replication

From
Ajin Cherian
Date:
Hello hackers,

I'd like to propose an improvement to the sequence replication feature
that was committed in [1].

The current implementation synchronizes sequences during initial
subscription setup, but the sequence sync worker exits after this
initial sync. This means that as sequences advance on the publisher,
they drift from the subscriber values over time. Users must manually
run ALTER SUBSCRIPTION ... REFRESH SEQUENCES to resynchronize, which
requires monitoring and intervention.

Proposed Enhancement:

This patch changes the sequence sync worker to run continuously
throughout the subscription lifetime, automatically detecting and
correcting sequence drift. The key changes are:

1. The sequence sync worker remains running instead of exiting after
initial sync, periodically checking for and synchronizing drifted
sequences.

2. The worker uses an exponential backoff strategy - starting at 2
seconds, doubling up to a maximum of 30 seconds when sequences are in
sync, and resetting to the minimum interval when drift is detected.

3. Since synchronization is now automatic, ALTER SUBSCRIPTION ...
REFRESH SEQUENCES is no longer needed and has been removed.
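
For illustration, the backoff in point 2 could be sketched as below. This is a minimal sketch and not the patch's code: the function name and macro names are hypothetical, and only the 2-second minimum, the 30-second maximum, the doubling, and the reset on drift come from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Bounds taken from the proposal; the macro names are hypothetical. */
#define SEQSYNC_MIN_NAPTIME_MS 2000L
#define SEQSYNC_MAX_NAPTIME_MS 30000L

/*
 * Compute the next nap time of the worker loop.
 * Drift resets the interval to the minimum; otherwise the interval
 * doubles and is capped at the maximum.
 */
static long
next_naptime_ms(long current_ms, bool drift_detected)
{
    long next_ms;

    assert(current_ms >= SEQSYNC_MIN_NAPTIME_MS &&
           current_ms <= SEQSYNC_MAX_NAPTIME_MS);

    if (drift_detected)
        return SEQSYNC_MIN_NAPTIME_MS;

    next_ms = current_ms * 2;
    return (next_ms > SEQSYNC_MAX_NAPTIME_MS) ? SEQSYNC_MAX_NAPTIME_MS : next_ms;
}
```

Under these assumptions an idle worker would nap for 2, 4, 8, 16, 30, 30, ... seconds, and drop back to 2 seconds as soon as drift is seen.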

The patch modifies documentation to reflect the new behavior, removes
the REFRESH SEQUENCES command from the grammar and subscription
commands, and implements the continuous monitoring loop in
sequencesync.c. Tap tests have been updated to verify automatic
synchronization rather than manual refresh.

The v2 patch is attached and ready for review.

Thoughts and feedback are welcome!

[1] - https://github.com/postgres/postgres/commit/5509055d6956745532e65ab218e15b99d87d66ce

Best regards,
Ajin Cherian
Fujitsu Australia

Attachments

Re: [PATCH] Support automatic sequence replication

From
shveta malik
Date:
On Tue, Feb 3, 2026 at 9:18 AM Ajin Cherian <itsajin@gmail.com> wrote:
>
> Hello hackers,
>
> I'd like to propose an improvement to the sequence replication feature
> that was committed in [1].
>
> The current implementation synchronizes sequences during initial
> subscription setup, but the sequence sync worker exits after this
> initial sync. This means that as sequences advance on the publisher,
> they drift from the subscriber values over time. Users must manually
> run ALTER SUBSCRIPTION ... REFRESH SEQUENCES to resynchronize, which
> requires monitoring and intervention.
>
> Proposed Enhancement:
>
> This patch changes the sequence sync worker to run continuously
> throughout the subscription lifetime, automatically detecting and
> correcting sequence drift. The key changes are:
>
> 1. The sequence sync worker remains running instead of exiting after
> initial sync, periodically checking for and synchronizing drifted
> sequences.
>
> 2. The worker uses an exponential backoff strategy - starting at 2
> seconds, doubling up to a maximum of 30 seconds when sequences are in
> sync, and resetting to the minimum interval when drift is detected.
>
> 3. Since synchronization is now automatic, ALTER SUBSCRIPTION ...
> REFRESH SEQUENCES is no longer needed and has been removed.
>
> The patch modifies documentation to reflect the new behavior, removes
> the REFRESH SEQUENCES command from the grammar and subscription
> commands, and implements the continuous monitoring loop in
> sequencesync.c. Tap tests have been updated to verify automatic
> synchronization rather than manual refresh.
>
> The attached v2 patch is attached and ready for review.
>
> Thoughts and feedback are welcome!
>
> [1] - https://github.com/postgres/postgres/commit/5509055d6956745532e65ab218e15b99d87d66ce
>

Thanks for the patch.

+1 for the overall idea of the patch: once a subscription that
subscribes to sequences is created, a sequence sync worker is started
and continuously syncs the sequences. This makes REFRESH SEQUENCES
redundant, so it is removed. I am still reviewing the design choice
here and will post my comments soon (if any).

From a quick validation, I found a few issues in the current implementation:

1)
If the sequence sync worker exits due to some issue (or is killed, or
the server restarts), the apply worker does not start it again unless
there is a sequence in INIT state, i.e. synchronization of sequences
in READY state stops. IIUC, the logic of ProcessSequencesForSync()
needs to change to start the sequence sync worker irrespective of the
state of the sequences.

2)
There is some issue in how LOGs (DEBUGs) are getting generated.

a) Even if there is no drift, it still keeps on dumping:
"logical replication sequence synchronization for subscription "sub1"
- total unsynchronized: 3"

b)
When there is drift in, say, a single sequence, it puts the rest
(which are in sync) in the "missing" section:
"logical replication sequence synchronization for subscription "sub1"
- batch #1 = 3 attempted, 1 succeeded, 0 mismatched, 0 insufficient
permission, 2 missing from publisher, 0 skipped"

3)
If the sequence sync worker is taking a nap and the subscription is
disabled, or the server is stopped just before an upgrade, how is the
user supposed to know that the sequences are synced at the end?

thanks
Shveta



Re: [PATCH] Support automatic sequence replication

From
Ajin Cherian
Date:
On Tue, Feb 3, 2026 at 9:22 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Feb 3, 2026 at 9:18 AM Ajin Cherian <itsajin@gmail.com> wrote:
> Thanks for the patch.
>
> +1 for the overall idea of patch that once a subscription is created
> which subscribes to sequences, a sequence sync worker is started which
> continuously syncs the sequences. This makes usage of REFRESH
> SEQUENCES redundant and thus it is removed. I am still reviewing the
> design choice here, and will post my comments soon (if any).
>

Thanks!

> By quick validation, few issues in current implementation:
>
> 1)
> If the sequence sync worker exits due to some issue (or killed or
> server restarts), sequence-sync worker is not started again by apply
> worker unless there is a sequence in INIT state i.e. synchronization
> of sequences in READY state stops. IIUC, the logic of
> ProcessSequencesForSync() needs to change to start seq sync worker
> irrespective of state of sequences.
>

Yes, I fixed this. I've changed FetchRelationStates to fetch sequences
in ANY state and not just ones in NON READY state.
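
As a rough illustration of the behavioural change (not the actual FetchRelationStates or ProcessSequencesForSync code), the launch decision could be modelled as below. The struct and both function names are hypothetical; only the INIT/READY distinction comes from the discussion above.

```c
#include <stdbool.h>

/* Hypothetical summary of a subscription's sequences by sync state. */
typedef struct SeqStateCounts
{
    int n_init;     /* sequences in INIT state */
    int n_ready;    /* sequences in READY state */
} SeqStateCounts;

/* Earlier behaviour: a worker is needed only while some sequence is in INIT. */
static bool
need_seqsync_worker_init_only(const SeqStateCounts *c)
{
    return c->n_init > 0;
}

/* Revised behaviour: a worker is needed whenever any sequence is subscribed. */
static bool
need_seqsync_worker_any_state(const SeqStateCounts *c)
{
    return (c->n_init + c->n_ready) > 0;
}
```

With three READY sequences and none in INIT, the first check declines to restart a worker that has exited, while the second restarts it.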

> 2)
> There is some issue in how LOGs (DEBUGs) are getting generated.
>
> a) Even if there is no drift, it still keeps on dumping:
> "logical replication sequence synchronization for subscription "sub1"
> - total unsynchronized: 3"
>

Removed this.

> b)
> When there is a drift in say single sequence, it puts rest (which are
> in sync) to "missing" section:
> "logical replication sequence synchronization for subscription "sub1"
> - batch #1 = 3 attempted, 1 succeeded, 0 mismatched, 0 insufficient
> permission, 2 missing from publisher, 0 skipped"
>

Fixed, and added a new section called "no drift". This debug message
is now printed every time the worker attempts to synchronize
sequences, and it also mentions the state of the sequences being synced.
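
To make the intended accounting concrete, here is a minimal sketch of per-batch tallying with a separate "no drift" bucket. The types, names, and the reduction of a sequence to a single value are assumptions for illustration only; the real worker tracks more categories (mismatched, insufficient permission, skipped) and more sequence state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of the outcome categories in the debug message. */
typedef enum SeqOutcome
{
    SEQ_SYNCED,     /* drift found and corrected ("succeeded") */
    SEQ_NO_DRIFT,   /* already in sync, nothing to do */
    SEQ_MISSING,    /* not found on the publisher */
    SEQ_NUM_OUTCOMES
} SeqOutcome;

/* Hypothetical view of one sequence on both nodes. */
typedef struct SeqPair
{
    bool found_on_publisher;
    long publisher_value;
    long subscriber_value;
} SeqPair;

static SeqOutcome
classify_sequence(const SeqPair *p)
{
    if (!p->found_on_publisher)
        return SEQ_MISSING;
    if (p->publisher_value == p->subscriber_value)
        return SEQ_NO_DRIFT;    /* in sync must not be counted as missing */
    return SEQ_SYNCED;
}

/* Reset the counters, then count each sequence of the batch once. */
static void
tally_batch(const SeqPair *batch, size_t n, int counts[SEQ_NUM_OUTCOMES])
{
    for (int i = 0; i < SEQ_NUM_OUTCOMES; i++)
        counts[i] = 0;
    for (size_t i = 0; i < n; i++)
        counts[classify_sequence(&batch[i])]++;
}
```

For the batch from the report above (three sequences, one drifted), this yields 1 synced, 2 no drift, and 0 missing.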

> 3)
> If a sequence sync worker is taking a nap, and subscription is
> disabled or the server is stopped just before upgrade, how is the user
> supposed to know that sequences are synced at the end?

Well, one way is to wait for a debug message that says that all the
attempted sequences are in the "no drift" state. Also, the remote
sequence's LSN is updated in pg_subscription_rel for each sequence.
Let me know if you have anything more in mind. One option is to leave
ALTER SUBSCRIPTION ... REFRESH SEQUENCES in place; it would change
the state of all the sequences to INIT, and the user could then wait
for the sequences to change state to READY.

Attaching patch v3 addressing the above comments.

regards,
Ajin Cherian
Fujitsu Australia

Attachments

Re: [PATCH] Support automatic sequence replication

From
shveta malik
Date:
We revisited the design of this patch. Sharing my thoughts and
analysis here. Any feedback is appreciated.

Background:
-----------------------
Previously, sequence synchronization was triggered during CREATE
SUBSCRIPTION, ALTER SUBSCRIPTION REFRESH PUBLICATION, and REFRESH
SEQUENCES. A sequence-sync worker was started whenever a sequence
entered the INIT state, which could also occur if a previous sync
failed.

Therefore, a mechanism was required to continuously scan
pg_subscription_rel and start a sequence-sync worker for a
subscription whenever any sequence was found in INIT. Since the apply
worker already performs this role for table-sync workers, the same
infrastructure was reused for sequence-sync workers. Using the
launcher for this purpose was rejected, as it would have required
overloading the launcher with logic to repeatedly inspect
pg_subscription_rel and decide whether to start a worker for each
sequence (see discussion at [1]).

Current scenario:
-----------------------
The requirement is different now: the sequence-sync worker is now
expected to run continuously, independent of sequence state. This
makes us revisit our design choices and re-analyze whether we can do
it in the launcher.

The primary benefit of starting the sequence-sync worker from the
launcher would be avoiding an extra apply worker for sequence-only
subscriptions. However, this approach introduces challenges. The
launcher currently accesses only global pg_subscription and does not
establish a database connection (see [2]). To decide whether to start
an apply worker, a sequence-sync worker, or both, the launcher would
need to access pg_subscription_rel, which requires a database
connection. It is unclear which database the launcher should connect
to, since subscriptions can target different databases. Another option
would be to explicitly feed this information to the launcher during
CREATE SUBSCRIPTION and REFRESH PUBLICATION by having an additional
column in pg_subscription indicating object_type: table_only,
seq_only, both. This would undoubtedly add complexity. Also, I am
unsure if it is a good idea to add an additional field to the global
catalog pg_subscription for this purpose.
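
To illustrate the second option, the launcher's decision could be driven by such a column roughly as follows. This is only a sketch of the idea under discussion: the enum, the struct, and the function are hypothetical, and no such column exists in pg_subscription today.

```c
#include <stdbool.h>

/* Hypothetical values of the proposed object_type column. */
typedef enum SubObjectType
{
    SUB_OBJ_TABLE_ONLY,
    SUB_OBJ_SEQ_ONLY,
    SUB_OBJ_BOTH
} SubObjectType;

/* Which workers the launcher would start for one subscription. */
typedef struct LaunchPlan
{
    bool start_apply;
    bool start_seqsync;
} LaunchPlan;

static LaunchPlan
plan_workers(SubObjectType object_type)
{
    LaunchPlan plan = {false, false};

    switch (object_type)
    {
        case SUB_OBJ_TABLE_ONLY:
            plan.start_apply = true;
            break;
        case SUB_OBJ_SEQ_ONLY:
            /* The benefit discussed above: no apply worker is needed. */
            plan.start_seqsync = true;
            break;
        case SUB_OBJ_BOTH:
            plan.start_apply = true;
            plan.start_seqsync = true;
            break;
    }
    return plan;
}
```

The decision itself is trivial; the cost lies in keeping the column correct on every CREATE SUBSCRIPTION and REFRESH PUBLICATION.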

That said, it is reasonable to expect that users who create a
publication for ALL SEQUENCES will typically have only a single
publication–subscription pair. In such cases, the overhead of an extra
apply worker per subscription, along with a sequence-sync worker, is
likely acceptable.
~~

Considering the above, starting the sequence-sync worker from the
launcher seems feasible (though it would require a more detailed
analysis), but it comes with its own complexities. OTOH, potentially
significant extra worker overhead, which could impact the system,
would only occur if a large number of 'sequence-only' subscriptions
were created. It is unclear whether there is ever a need for multiple
ALL-SEQUENCES subscriptions, or whether business requirements would
call for subscribing to multiple machines for ALL SEQUENCES, which
would necessitate multiple such subscriptions.

Given this, it seems reasonable to continue with the current design of
starting the sequence-sync worker from the apply worker. We may think
of other approaches if there is any objection or user-feedback for
this approach.

~~

[1]: https://www.postgresql.org/message-id/CAA4eK1%2Bp%3DM%2B5NAq5VSxD4_XyE1MBTKwU40RD1cL9PgpbELKBRQ%40m…
[2]:
    /*
     * Establish connection to nailed catalogs (we only ever access
     * pg_subscription).
     */
    BackgroundWorkerInitializeConnection(NULL, NULL, 0);

thanks
Shveta