Re: Conflict detection and logging in logical replication

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: Conflict detection and logging in logical replication
Дата
Msg-id CAA4eK1JkFT9w9eQ+GwfQxJBNLwUaAzhR7L=b-=waWXfPnvR+kQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Conflict detection and logging in logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
On Thu, Jul 25, 2024 at 4:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Jul 25, 2024 at 12:04 PM Zhijie Hou (Fujitsu)
> <houzj.fnst@fujitsu.com> wrote:
> >
> > Here is the V6 patch set which addressed Shveta and Nisha's comments
> > in [1][2][3][4].
> >
>
> Do we need an option detect_conflict for logging conflicts? The
> possible reason to include such an option is to avoid any overhead
> during apply due to conflict detection. IIUC, to detect some of the
> conflicts like update_differ and delete_differ, we would need to fetch
> commit_ts information which could be costly but we do that only when
> GUC track_commit_timestamp is enabled which would anyway have overhead
> on its own. Can we do performance testing to see how much additional
> overhead we have due to fetching commit_ts information during conflict
> detection?
>
> The other time we need to enquire commit_ts is to log the conflict
> detection information which is an ERROR path, so performance shouldn't
> matter in this case.
>
> In general, it would be good to enable conflict detection/logging by
> default but if it has overhead then we can consider adding this new
> option. Anyway, adding an option could be a separate patch (at least
> for review), let the first patch be the core code of conflict
> detection and logging.
>
> minor cosmetic comments:
> 1.
> +static void
> +check_conflict_detection(void)
> +{
> + if (!track_commit_timestamp)
> + ereport(WARNING,
> + errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("conflict detection could be incomplete due to disabled
> track_commit_timestamp"),
> + errdetail("Conflicts update_differ and delete_differ cannot be
> detected, and the origin and commit timestamp for the local row will
> not be logged."));
> +}
>
> The errdetail string is too long. It would be better to split it into
> multiple rows.
>
> 2.
> -
> +static void check_conflict_detection(void);
>
> Spurious line removal.
>

A few more comments:
1.
For duplicate key, the patch reports conflict as following:
ERROR:  conflict insert_exists detected on relation "public.t1"
2024-07-26 11:06:34.570 IST [27800] DETAIL:  Key (c1)=(1) already
exists in unique index "t1_pkey", which was modified by origin 1 in
transaction 770 at 2024-07-26 09:16:47.79805+05:30.
2024-07-26 11:06:34.570 IST [27800] CONTEXT:  processing remote data
for replication origin "pg_16387" during message type "INSERT" for
replication target relation "public.t1" in transaction 742, finished
at 0/151A108

In detail, it is better to display the origin name instead of the
origin id. This will be similar to what we do in CONTEXT information.

2.
if (resultRelInfo->ri_NumIndices > 0)
  recheckIndexes = ExecInsertIndexTuples(resultRelInfo,
-    slot, estate, false, false,
-    NULL, NIL, false);
+    slot, estate, false,
+    conflictindexes, &conflict,

It is better to use true/false for the bool parameter (something like
conflictindexes ? true : false). That will make the code easier to
follow.

3. The need for ReCheckConflictIndexes() is not clear from comments.
Can you please add a few comments to explain this?

4.
-   will simply be skipped.
+   will simply be skipped. Please refer to <link
linkend="sql-createsubscription-params-with-detect-conflict"><literal>detect_conflict</literal></link>
+   for all the conflicts that will be logged when enabling
<literal>detect_conflict</literal>.
   </para>

It would be easier to read the patch if you move <link .. to the next line.

--
With Regards,
Amit Kapila.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Noah Misch
Дата:
Сообщение: Re: Built-in CTYPE provider
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: Allow logical failover slots to wait on synchronous replication