RE: Conflict detection for update_deleted in logical replication
От | Zhijie Hou (Fujitsu) |
---|---|
Тема | RE: Conflict detection for update_deleted in logical replication |
Дата | |
Msg-id | TY4PR01MB169079F275EE027F29E2D24889438A@TY4PR01MB16907.jpnprd01.prod.outlook.com обсуждение исходный текст |
Ответ на | Re: Conflict detection for update_deleted in logical replication (Amit Kapila <amit.kapila16@gmail.com>) |
Список | pgsql-hackers |
On Tuesday, August 26, 2025 2:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > On Mon, Aug 25, 2025 at 5:05 PM Amit Kapila <amit.kapila16@gmail.com> > wrote: > > > > A few comments on 0001: > > > > Some more comments: Thanks for the comments! > 1. > + /* > + * Return false if the leader apply worker has stopped retaining > + * information for detecting conflicts. This implies that > + update_deleted > + * can no longer be reliably detected. > + */ > + if (!retention_active) > + return false; > + > /* > * For conflict detection, we use the conflict slot's xmin value instead > * of invoking GetOldestNonRemovableTransactionId(). The slot.xmin acts as > @@ -3254,7 +3315,15 @@ FindDeletedTupleInLocalRel(Relation localrel, Oid > localidxoid, > oldestxmin = slot->data.xmin; > SpinLockRelease(&slot->mutex); > > - Assert(TransactionIdIsValid(oldestxmin)); > + /* > + * Return false if the conflict detection slot.xmin is set to > + * InvalidTransactionId. This situation arises if the current worker is > + * either a table synchronization or parallel apply worker, and the > + leader > + * stopped retention immediately after checking the > + * oldest_nonremovable_xid above. > + */ > + if (!TransactionIdIsValid(oldestxmin)) > + return false; > > If the current worker is tablesync or parallel_apply, it should have exited from > the above check of retention_active as we get the leader's > oldest_nonremovable_xid to decide that. What am, I missing? This made me > wonder whether we need to use slot's xmin after we have fetched leader's > oldest_nonremovable_xid to find deleted tuple? There was a race condition that the leader could stop retention immediately after the pa worker fetches its oldest_nonremovable_xid. But I agree that it would be simpler to directly use oldest_nonremovable_xid to find deleted tuple, so changed. This way, the logic can be simplified. > > 2. > - * The interval is reset to a minimum value of 100ms once there is some > - * activity on the node. > + * The interval is reset to the lesser of 100ms and > + * max_conflict_retention_duration once there is some activities on the node. > AFAICS, this is not adhered in the code because you are using it when there is > no activity aka when new_xid_found is false. IS the comment wrong or code > needs some updation? I think both needs to be updated. I have adjusted the code to consider max_retention in both cases and update the comments. All the comments have been addressed in V66 patch set. Best Regards, Hou zj
В списке pgsql-hackers по дате отправления: