Re: [PATCH] Add `truncate` option to subscription commands

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: [PATCH] Add `truncate` option to subscription commands
Дата
Msg-id CAA4eK1JKdGFXWVfoGUEgrUGY0rL8mgnPDKZd0pA6Cdx9oy_mmA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PATCH] Add `truncate` option to subscription commands  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Ответы Re: [PATCH] Add `truncate` option to subscription commands  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Список pgsql-hackers
On Sat, May 22, 2021 at 9:58 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Thu, Nov 26, 2020 at 12:16 AM David Christensen <david@endpoint.com> wrote:
> >
> > Hi,
> >
> > At this time I do not have time to make the necessary changes for this
> > commitfest so I am voluntarily withdrawing this patch, but will
> > revisit at a future time.
>
> Hi,
>
> This feature looks useful in the sense that it avoids users having to
> manually lookup all the tables on all the subscribers for truncation
> (in case they want the subscriber tables to exactly sync with the
> publisher tables).
>
> I have gone through the prior discussions on this thread. IMO, we can
> always go ahead with TRUNCATE ... RESTRICT behavior to avoid some
> unnecessary truncation of subscriber local tables (if at all users
> have such tables) that can arise due to CASCADE option. It looks like
> there are some problems with the FK - PK dependencies. Below are my
> thoughts:
>
> 1) Whether a table the sync worker is trying to truncate is having any
> referencing (foreign key) tables on the subscriber? If yes, whether
> all the referencing tables are present in the list of subscription
> tables (output of fetch_table_list)? In this case, the sync worker is
> truncating the primary key/referenced table.
>
> One way to solve the above problem is by storing the table oids of the
> subscription tables (output of fetch_table_list) in a new column in
> the pg_subscription catalog (like subpublications text[] column). In
> the sync worker, before truncation of a table, use
> heap_truncate_find_FKs to get all the referencing tables of the given
> table and get all the subscription tables from the new pg_subscription
> column. If all the referencing tables exist in the subscription
> tables, then truncate the table, otherwise don't, just skip it.
>

Here, silently skipping doesn't seem like a good idea when the user
has asked to truncate the table. Shouldn't we allow it if the user has
provided say cascade with a truncate option?

> There
> can be a problem here if there are many subscription tables, the size
> of the new column in pg_susbcription can be huge. However, we can
> choose to store the table ids in this new column only when the
> truncate option is specified.
>
> Another way is to let each table sync worker scan the
> pg_subscription_rel to get all the relations that belong to a
> subscription. But I felt this was costly.
>

I feel it is better to use pg_subscription_rel especially because we
will do so when the user has given the truncate option and note that
we are already accessing it in sync worker for both reading and
writing. See LogicalRepSyncTableStart.

> 2) Whether a table the sync worker is trying to truncate is a
> referencing table for any of the subscriber tables that is not part of
> the subscription list of tables? In this case, the table the sync
> worker is truncating is the foreign key/referencing table.
>
> This isn't a problem actually, the sync worker can safely truncate the
> table. This is also inline with the current TRUNCATE command
> behaviour.
>
> 3) I think we should allow the truncate option with CREATE
> SUBSCRIPTION, ALTER SUBSCRIPTION ... REFRESH/SET/ADD PUBLICATION,
> basically wherever copy_data and refresh options can be specified. And
> there's no need to store the truncate option in the pg_subscription
> catalogue because we allow it to be specified with only DDLs.
>

makes sense.

One other problem discussed in this thread was what to do when the
same table is part of multiple subscriptions and the user has provided
a truncate option while operating on such a subscription.

-- 
With Regards,
Amit Kapila.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Greg Nancarrow
Дата:
Сообщение: Re: Re: Parallel scan with SubTransGetTopmostTransaction assert coredump
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: Logical Replication - behavior of TRUNCATE ... CASCADE