Re: Logical Replication of sequences
От | vignesh C |
---|---|
Тема | Re: Logical Replication of sequences |
Дата | |
Msg-id | CALDaNm0QCp8ZP2n5rn3c4uB6YSpGQvWAxRTF9JYqk7hFjD9yPw@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Logical Replication of sequences (shveta malik <shveta.malik@gmail.com>) |
Список | pgsql-hackers |
On Fri, 11 Jul 2025 at 14:26, shveta malik <shveta.malik@gmail.com> wrote: > > On Wed, Jul 9, 2025 at 4:11 PM vignesh C <vignesh21@gmail.com> wrote: > > > > > > 3) > > > SyncFetchRelationStates: > > > Earlier the name was FetchTableStates. If we really want to use the > > > 'Sync' keyword, we can name it FetchRelationSyncStates, as we are > > > fetching sync-status only. Thoughts? > > > > Instead of FetchRelationSyncStates, I preferred FetchRelationStates, > > and changed it to FetchRelationStates. > > > > Okay, LGTM. > > > > 5) > > > + if (!MyLogicalRepWorker->sequencesync_failure_time || > > > + TimestampDifferenceExceeds(MyLogicalRepWorker->sequencesync_failure_time, > > > + now, wal_retrieve_retry_interval)) > > > + { > > > + MyLogicalRepWorker->sequencesync_failure_time = 0; > > > + > > > + logicalrep_worker_launch(WORKERTYPE_SEQUENCESYNC, > > > + MyLogicalRepWorker->dbid, > > > + MySubscription->oid, > > > + MySubscription->name, > > > + MyLogicalRepWorker->userid, > > > + InvalidOid, > > > + DSM_HANDLE_INVALID); > > > + break; > > > + } > > > > > > We set sequencesync_failure_time to 0, but if logicalrep_worker_launch > > > is not able to launch the worker due to some reason, next time it will > > > not even wait for 'wal_retrieve_retry_interval time' to attempt > > > restarting it again. Is that intentional? > > > > > > In other workflows such as while launching table-sync or apply worker, > > > this scenario does not arise. This is because we maintain start_time > > > there which can never be 0 instead of failure time and before > > > attempting to start the workers, we set start_time to current time. > > > The seq-sync failure-time OTOH is only set to non-null in > > > logicalrep_seqsyncworker_failure() and it is not necessary that we > > > will hit that function as the logicalrep_worker_launch() may fail > > > before that itself. Do you think we shall maintain start-time instead > > > of failure-time for seq-sync worker as well? Or is there any other way > > > to handle it? > > > > I preferred the suggestion from [1]. Modified it accordingly. > > Okay, works for me. > > > The attached v20250709 version patch has the changes for the same. > > > > Thank You for the patches. Please find a few comments: > > 1) > Shall we update pg_publication doc as well to indicate that pubinsert, > pubupdate, pubdelete , pubtruncate, pubviaroot are meaningful only > when publishing tables. For sequences, these have no meaning. Since it is clearly mentioned it is for tables, I felt no need to mention again it is not applicable for sequences. > 2) > Shall we have walrcv_disconnect() after copy is done in > LogicalRepSyncSequences() There is a cleanup function registered for the worker to handle this at the worker exit. So this is not required. > 3) > Do we really need for-loop in ProcessSyncingSequencesForApply? I think > this function is inspired from ProcessSyncingTablesForApply() but > there we need different tablesync workers for different tables. For > sequences, that is not the case and thus for-loop can be omitted. If > we do so, we can amend the comments too where it says " Walk over all > subscription sequences....." Handled in the previous version > 4) > +# Confirm that the warning for parameters differing is logged. > +$node_subscriber->wait_for_log( > > We can drop regress_seq_sub on the publisher now and check for missing > warnings as the next step. Modified > 5) > I am revisiting the test given in [1], I see there is some document change as: > > + Incremental sequence changes are not replicated. Although the data in > + serial or identity columns backed by sequences will be replicated as part > + of the table, the sequences themselves do not replicate ongoing changes. > + On the subscriber, a sequence will retain the last value it synchronized > + from the publisher. If the subscriber is used as a read-only database, > + then this should typically not be a problem. If, however, some kind of > + switchover or failover to the subscriber database is intended, then the > + sequences would need to be updated to the latest values, either by > + executing <link > linkend="sql-altersubscription-params-refresh-publication-sequences"> > + <command>ALTER SUBSCRIPTION ... REFRESH PUBLICATION > SEQUENCES</command></link> > + or by copying the current data from the publisher (perhaps using > + <command>pg_dump</command>) or by determining a sufficiently high value > + from the tables themselves. > > But this doc specifically mentions a failover case. It does not > mention the case presented in [1] i.e. if user is trying to use > sequence to populate identity column of a "subscribed" table where the > sequence is also synced originally from publisher, then he may end up > with corrupted state > of IDENTITY column, and thus such cases should be used with caution. > Please review once and see if we need to mention this and the example > too. In this case, the identity column data—as well as the non-identity columns—will be sent by the publisher as part of the row data. This behavior appears consistent with how non-sequence objects are handled in a publication. The following documentation note should be sufficient, as it already clarifies that "it will retain the last value it synchronized from the publisher": On the subscriber, a sequence will retain the last value it synchronized from the publisher. If the subscriber is used as a read-only database, this should typically not pose a problem. But if you have something in mind which should be added let me know. The attached patch has the changes for the same. Regards, Vignesh
Вложения
- v20250714-0001-Introduce-pg_sequence_state-function-for-e.patch
- v20250714-0004-Introduce-REFRESH-PUBLICATION-SEQUENCES-fo.patch
- v20250714-0003-Reorganize-tablesync-Code-and-Introduce-sy.patch
- v20250714-0005-New-worker-for-sequence-synchronization-du.patch
- v20250714-0002-Introduce-ALL-SEQUENCES-support-for-Postgr.patch
- v20250714-0006-Documentation-for-sequence-synchronization.patch
В списке pgsql-hackers по дате отправления: